Abstract

Entanglement measures based on a logarithmic functional form naturally emerge in any attempt to
quantify the degree of entanglement in the state of a multipartite quantum system. These measures can be
regarded as generalizations of the classical Shannon-Wiener information of a probability distribution into
the quantum regime. In the present work we introduce a previously unknown approach to the Shannon-Wiener
information which provides an intuitive interpretation for its functional form and puts all
entanglement measures with a similar structure into a new context: By formalizing the process of information
gaining in a set-theoretical language we arrive at a mathematical structure which we call "tree structures"
over a given set. On each tree structure, a tree function can be defined, reflecting the degree of splitting
and branching in the given tree. We show in detail that the minimization of the tree function on, possibly
constrained, sets of tree structures renders the functional form of the Shannon-Wiener information. This
finding demonstrates that entropy-like information measures may themselves be understood as the result
of a minimization process on a more general underlying mathematical structure, thus providing an entirely
new interpretational framework for entropy-like measures of information and entanglement. We suggest
three natural axioms for defining tree structures, which turn out to be related to the axioms describing
neighbourhood topologies on a topological space. The same minimization that renders the functional form
of the Shannon-Wiener information from the tree function then assigns a preferred topology to the underlying
set, hinting at a deep relation between entropy-like measures and neighbourhood topologies.
1 Introduction



In the past two decades, two well-established disciplines within Mathematics and Physics, namely Information
Theory and Quantum Physics, have merged into a promising new field of its own. This process was
instigated by advances in Atomic and Molecular Physics which opened up the possibility of controlling the
behaviour of matter by means of laser light down to very small length scales: from the nano regime, involving
mesoscopic systems, even further down to the control of single atoms and molecules by means of appropriate
laser radiation. What is more, even the possibility of reducing the irradiating source down to nanoscales has
sprung up, and controlled single-photon sources have been demonstrated experimentally. Together
with these advances, the possibility of utilizing nanostructures, or even single molecules, as the fundamental
building blocks for future quantum computers has arisen. The study of information processing in such an
environment, its limitations as well as its capabilities to exceed classical computational processes, has
been named after the two main pillars contributing to the new discipline: Quantum Information
Theory.

However, it turns out that the scope of Quantum Information Theory is much wider than the range
of its possible applications within Quantum Computation might suggest. Indeed, it has been
demonstrated that Quantum Information Theory provides the main conceptual as well as computational
foundation for tackling some of the basic unsolved problems within Quantum Mechanics which have haunted
physicists for decades: the problem of Quantum Nonlocality, its presence signalled by the violation
of appropriate Bell inequalities [10, 11], and the nature of Entanglement between distinct quantum systems.
These questions have a strong overlap with the mystery of how a 'classical world' can emerge from a universe
governed by Quantum Mechanics in which all superpositions of states are allowed, but still only a tiny subset
of these, namely those which we perceive as 'classical', can be ordinarily observed. A possible answer has been
provided by the notion of Decoherence: the decay of quantum correlations between systems
which are subjected to the inevitable quantum noise from an environment which is ultimately understood to
be the universe as a whole. Here again, the notion of entanglement emerges as a central quantity.

One of the basic problems in Quantum Information Theory (QIT) is then to find measures for the
quantification of the degree of nonclassical correlations, or entanglement, between two physical systems. So far,
entanglement has been shown to be quantifiable in two regimes, called "finite" and "asymptotic" [12, 13, 14]:
The first one attempts to quantify the amount of entanglement within a single copy of a quantum state; the
second one deals with tensor products of a large number of identical copies of a given state. Many of the
entanglement measures proposed so far have a close relationship to the classical Shannon-Wiener information
[15, 17, 18] of a probability distribution, which is formally identical (up to a sign) to the thermodynamical
entropy of the same distribution. For example, the so-called "uniqueness theorem" [12, 19, 20] states that,
under appropriate conditions, all entanglement measures coincide on pure bipartite states and are equal to
the von Neumann entropy (the quantum analogue of the classical Shannon-Wiener information) of the
corresponding reduced density operators. It is this recurring fact which strongly hints at the relevance of
entropy-like measures of information, such as the Shannon-Wiener information, both in the classical and in
the quantum context.

In this work we present a new and rather unexpected approach to the concept of Shannon-Wiener
information: We show that this quantity can be understood as the result of minimizing a so-called tree function
on a mathematical structure, called a tree structure, which we define and investigate in this work. We show in
detail that the Shannon-Wiener information, as known to date, may be obtained as the minimal value of
the tree function. This puts the notion of Shannon-Wiener
information, and in turn all other measures of information and entanglement which are based upon it, into a
whole new context, which is presented in this text. Although tree-like objects are known in Information Theory
[22], Complexity Theory and Discrete Mathematics, the framework presented here is a new and original cast of
this theme, and differs from previously proposed concepts to such a degree that a complete and self-contained
account of the new structure is justified; this account is given in the present text.

We now give a brief qualitative overview of the new structure and its main properties:



What are tree structures: A tree structure B(X) over a given set X is a subset of the power set $\mathcal{P}X$ of
X which is obtained by a continued splitting of its nodes $b \in B(X)$ into smaller and ever smaller subsets;
this splitting is described in terms of partitions of sets. Tree structures can be defined over sets of arbitrary
cardinality, countable or uncountable. For infinite sets, the tree structures over X are fractal-like objects
[23]. There are three natural axioms governing tree structures, which are independent of whether the set X is
finite or infinite. These axioms give rise to preferred topologies on the underlying set X.

See sections |21 El 

How do tree structures arise: Tree structures arise in modelling processes of information gaining; they
are designed to capture the operational aspect of this process. In such a model we assign a natural number
to the outcome of an interaction between a unit that seeks to find a distinct but unknown element $x_0$ of a
set X, and a unit that possesses this information, but renders only information about "neighbourhoods" of
the distinct element, as these neighbourhoods zoom more and more into $x_0$. These "neighbourhoods" can be
given a topological meaning.

See sections El EZI 

What are the typical structural elements: The main structural elements are the "nodes" in the tree;
these are subsets of the underlying set X to which two characteristic numbers, the total number of elements
and the "degree of splitting" in the next-level partition, are assigned. Strings of such nodes can be pictured as
"paths" in the tree structure. To every path in a finite tree structure, a natural number, called the "amount"
of the path, can be assigned, which represents the maximal number of yes-no questions that are necessary
to single out the element x for the given path in the given tree. Of central importance is the sum over the
amounts of all complete paths in the tree; this sum will be called the "amount function". In this way, every
tree structure over a given set X can be assigned a unique value of the amount function. The amount functions
are related to the "tree function", which is a sum over cardinal numbers and degrees of splitting at every node
in the tree. On certain subsets of trees, amount functions and tree functions coincide.

See sections E HTOl HU [TJ 

The natural question concerning tree structures: Assigning a value of the tree function to every tree
over X, we can ask on which trees the tree function takes its minimum. This question can be generalized, as
constraints on the admissible trees can be imposed. The admissible trees then may be chosen so as to preserve
a prescribed initial partition of X, which reflects a choice of "weights" $(w_i)$ for the path amounts in such a
tree. This is analogous to choosing a probability distribution $(p_i)$ for the paths in the admissible trees.

See sections d El E3 EU 123 ED 

The first main result concerning tree structures: If there are no constraints, then the minimal value
of the tree function is close to $n \cdot \lg(n)$, where $\lg(n)$ is an integer approximation to the logarithm of n
to the base 2. Thus, the mean value of the amounts of n paths in a complete tree over X comes close
to $\lg(n)$, which is the information gained in finding a distinct element among n "equally weighted" elements,
or the entropy of n distinct states, depending on the context. One of the central results of this work is that
the functional form $\lg(n)$ of the entropy so defined is itself the result of a process of minimization, i.e. there
is a more general underlying functional form, namely the tree function. These results can be generalized to
the constrained case; here the terminal elements $b_i \in B(X)$ are endowed with weights $w_i$, so that the value
of the tree function contains expressions like $n \cdot \lg(n) - \sum_i w_i \cdot \lg(w_i)$. Here we recognize the
Shannon-Wiener information, or entropy, of the probability distribution, depending on the context. Again, we
have the striking result that the functional form of the entropy is itself the result of a process of optimization
of a more general expression, namely the tree function, and the entropy, as usually known, is only the minimal
value of this more general function.

See sections 12011111123111113 

The second main result concerning tree structures: Every tree structure over a set X defines a
neighbourhood topology [24] on X. As we vary the tree structures, so vary the topologies on X. A tree function
on a given set of, possibly constrained, tree structures will single out preferred neighbourhood topologies,
namely those for which the tree function becomes minimal. This defines something like an action principle for
neighbourhood topologies on the set X, where the value of the action (the tree function) on the minimal trees is
an entropy-like quantity.

See section 27.

The plan of this report is as follows: In section 2 we recall the definition of partitions of sets. In section 3
we outline how a tree structure encodes the operational aspect of the problem of information gaining, which
yields the concept of entropy/information. Three natural axioms describing tree structures are presented in
section 4. In section 5 we recall some elementary facts on ordered sets; in section 6 we show how the set of all
partitions of a given set X is a partially ordered set. The concept of subtrees is discussed in section 7, while
ideas concerning the sum, union, extension, reduction and completion of trees are introduced in section 8.
After this preparation we define paths in a tree structure in section 9. In section 10 we show how a tree over
X selects a distinct subset of partitions of the underlying set X; here we introduce the important concepts of
minimal and maximal partitions of the underlying set X in the tree B, and the number m(b) characterising
the degree of splitting of a node in the tree. Then we come to the central notions in our theory: In section 11
we introduce amount functions on sets of tree structures. The technical section 12 contains a splitting theorem
for amount functions. In section 13 we define the tree function on the set of all tree structures over X. The
problem of minimizing trees is first taken up in section 14. We then introduce the concept of divisions in
section 15 and explain its relation to partitions in section 16. In sections 17 and 18 we introduce optimal
divisions of sets, and the concept of optimal trees based on optimal divisions. Section 19 defines minimal
classes based on prescribed divisions. Section 20 introduces the integer approximation of the logarithm to
the base 2, together with some of its properties. The value of the optimal amount of an optimal tree
is derived in section 21. Section 22 introduces the notion of preoptimized trees, which is a tool of central
importance for the proof of the minimality of optimal trees. The latter result is approached in a series of
propositions given in section 23. Section 24 reflects the same statements from the point of view of the mean
path amount in a tree over X. In section 25 we introduce an important notion of structural similarity, captured
in an appropriate definition of isomorphism between trees. In section 26 we outline how to find constrained
minimal trees on which the functional form of the tree function contains expressions like
$-\sum_i w_i \lg(w_i)$. In the last section, section 27, we show how tree structures define neighbourhood
topologies on X, and how the tree function selects distinct topologies according to a minimal principle.

This report is based on the preprint |2fij . 
Notation convention: For the difference of two sets, which is commonly denoted as

$$ A \setminus B = \{ x \in A \mid x \notin B \} , $$

we shall use the notation $A - B = A \setminus B$ instead.

2 Partitions

Let X be a non-empty set. A partition z of X is a system of mutually disjoint non-empty subsets $\mu \subset X$
whose union is X, i.e.

(P1) $\bigcup_{\mu \in z} \mu = X$,

(P2) $\mu \cap \mu' = \emptyset$ for $\mu, \mu' \in z$ with $\mu \neq \mu'$.



The power set $\mathcal{P}X$ of X is the set of all subsets of X, including the empty set. That is to say,
$\mathcal{P}X$ contains the elements

$$ \emptyset ,\quad \{x\} \ \text{for } x \in X ,\quad \{x, y\} \ \text{for } x \neq y \in X ,\quad \ldots ,\quad X . \qquad (2) $$

We see that every partition is a subset $z \subset \mathcal{P}X$ of the power set of X.

The partition $z_0 = \{X\}$ will be called the trivial partition. The partition z is said to be complete if every
element $\mu$ of z contains precisely one element of X, i.e. $\#\mu = 1$ for all $\mu \in z$, or

$$ z = \{ \{x\} \in \mathcal{P}X \mid x \in X \} . \qquad (3) $$

The set of all partitions z of X will be denoted by Z(X). The set of all nontrivial partitions will be denoted
by Z*(X), i.e. $Z^*(X) = \{ z \in Z(X) \mid z \neq \{X\} \}$.
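
As an illustration that is not part of the original text, the conditions (P1) and (P2) and the notion of a complete partition can be checked mechanically for finite sets. The following Python sketch assumes that a partition is given as a list of Python sets; the function names `is_partition` and `is_complete` are our own.

```python
from itertools import combinations

def is_partition(z, X):
    """Return True if the family z satisfies (P1) and (P2) for the set X."""
    blocks = [frozenset(mu) for mu in z]
    if any(len(mu) == 0 for mu in blocks):
        return False  # the elements of a partition must be non-empty
    covers = frozenset().union(*blocks) == frozenset(X)                  # (P1)
    disjoint = all(a.isdisjoint(b) for a, b in combinations(blocks, 2))  # (P2)
    return covers and disjoint

def is_complete(z):
    """A partition is complete if every element contains exactly one point."""
    return all(len(mu) == 1 for mu in z)

X = {1, 2, 3, 4}
trivial = [X]                      # the trivial partition z_0 = {X}
complete = [{x} for x in X]        # the complete partition, eq. (3)
overlapping = [{1, 2}, {2, 3, 4}]  # violates (P2)
```

Here `trivial` and `complete` pass the check, while `overlapping` fails because two of its elements share the point 2.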

3 Motivation for tree structures

We want to show how tree structures arise in the course of modelling processes of information gaining. We now
describe such a model: Let X be a non-empty finite set, $n = \#X < \infty$. Let $x_0 \in X$ be arbitrary. We want
to find a numerical measure for the information that is gained when $x_0$ has been identified as a distinct object
amongst n objects. Consider the interaction of two (information processing) units: the first one (storage unit)
has stored the knowledge about $x_0$, and the second one (search unit) tries to identify $x_0$ amongst all n elements
of X. The only knowledge permitted to the search unit is that all n choices are equally likely. The
search unit starts by suggesting a partition of X to the storage unit; if the number of elements in the partition
is $m_1$, then the search unit has to pose at most $(m_1 - 1)$ yes-no questions to the storage unit in order to
identify the element of the partition that contains $x_0$. Next the search unit suggests a partition of the subset
that contains $x_0$, and so on. This gives the following scheme: On level 1, we have a partition $z(X) \in Z(X)$
with $\#z(X) = m_1$ elements, i.e.

$$ z(X) = \{ X_1, \ldots, X_{m_1} \} . \qquad (4) $$

On level 2 of the emerging tree we partition all the subsets $X_{i_1}$ in (4): We decompose $X_1$ into $m_2(1)$
non-empty subsets, $X_2$ into $m_2(2)$ non-empty subsets, ..., $X_{m_1}$ into $m_2(m_1)$ subsets; here the subscripts
1, 2 in $m_1, m_2$ refer to the levels 1 and 2, respectively. Hence for $i_1 = 1, \ldots, m_1$ we have partitions
$z(X_{i_1}) \in Z(X_{i_1})$ with cardinality $\#z(X_{i_1}) = m_2(i_1)$,

$$ z(X_{i_1}) = \{ X_{i_1,1}, \ldots, X_{i_1, m_2(i_1)} \} . \qquad (5) $$

Now we continue along these lines: $X_{1,1}$ is decomposed into $m_3(1,1)$ subsets; ...; $X_{1, m_2(1)}$ is decomposed
into $m_3(1, m_2(1))$ subsets; ...; $X_{m_1, m_2(m_1)}$ is decomposed into $m_3(m_1, m_2(m_1))$ subsets, i.e. for
$i_1 = 1, \ldots, m_1$, $i_2 = 1, \ldots, m_2(i_1)$ we introduce a partition $z(X_{i_1,i_2}) \in Z(X_{i_1,i_2})$ with cardinal
number $\#z(X_{i_1,i_2}) = m_3(i_1, i_2)$ such that

$$ z(X_{i_1,i_2}) = \{ X_{i_1,i_2,1}, \ldots, X_{i_1,i_2,m_3(i_1,i_2)} \} , \qquad (6) $$

etc. Any of the subsets $X_{i_1, i_2, \ldots}$ emerging in this process is an element of the power set $\mathcal{P}X$ of X.
The totality of all these subsets is a certain subset of the power set of X which we shall term a tree structure
or simply a tree B(X) over X. Hence,

$$ B(X) = \{ X ,\ X_1, \ldots, X_{m_1} ,\ X_{1,1}, \ldots, X_{1,m_2(1)}, \ldots, X_{m_1,1}, \ldots, X_{m_1,m_2(m_1)} ,\ \ldots \} . \qquad (7) $$



We see that the elements of a given tree structure B(X) can be labelled by series of the form

$$ \emptyset ,\quad (1), \ldots, (m_1) ,\quad (1,1), \ldots, (1, m_2(1)), \ldots, (m_1, 1), \ldots, (m_1, m_2(m_1)) ,\quad \ldots \qquad (8) $$

If the set X is infinite, there can be series (8) which extend forever. On the other hand, if X is finite, then
each of these series is finite and can be denoted in the form $(i_1, \ldots, i_k)$. In this case the cardinalities

$$ n(i_1, \ldots, i_k) = \# X_{i_1, \ldots, i_k} \qquad (9) $$

are natural numbers.

Let b be a terminal element in the tree, with series $(i_1, \ldots, i_k)$; this series may be called complete if
$n(i_1, \ldots, i_k) = 1$, otherwise it will be called incomplete. Hence, in a tree over a finite set X with $n = \#X$ we
can have at most n distinct complete series.

A tree B(X) may be called complete if all series associated with terminal nodes are complete.

For a given set X let M(X) denote the set of all tree structures over X. The set of all complete tree
structures over X will be denoted by C(X). Clearly, $C(X) \subsetneq M(X)$ for $\#X \ge 2$.

We have seen how tree structures emerge naturally in processes modelling information gaining. The
basic properties of tree structures, as they present themselves from the above analysis, will be compiled in the
next section.
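
The search scheme above can be sketched in a few lines of Python. This is only an illustration under assumptions of our own: the helper names `build_tree`, `halves` and `questions` are hypothetical, the tree is stored as a set of frozensets, and the question count simply adds the bound $(m - 1)$ from the text at every level along the way to $x_0$.

```python
def build_tree(elements, split):
    """Collect all subsets generated by repeatedly partitioning `elements`."""
    node = frozenset(elements)
    nodes = {node}
    if len(node) > 1:
        for part in split(sorted(node)):
            nodes |= build_tree(part, split)
    return nodes

def halves(items):
    """One possible splitting rule: two non-empty, nearly equal parts."""
    mid = (len(items) + 1) // 2
    return [items[:mid], items[mid:]]

def questions(nodes, x0):
    """Add (m - 1) yes-no questions per level on the way down to x0."""
    total = 0
    current = max(nodes, key=len)  # the root X is the largest node
    while True:
        proper = [b for b in nodes if b < current]
        children = [b for b in proper if not any(b < c for c in proper)]
        if not children:
            return total
        total += len(children) - 1
        current = next(c for c in children if x0 in c)

X = list(range(1, 9))
binary = build_tree(X, halves)                          # split in two at every level
flat = build_tree(X, lambda items: [[x] for x in items])  # one complete partition
```

For the eight-element set, repeated halving needs 3 questions for every element, whereas the single complete partition needs up to 7; this is the kind of difference between trees that the later sections quantify.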

4 Axiomatic definition of tree structures 

We now suggest three natural axioms defining a tree structure as a set of subsets of X, as motivated in eq. (7):
A tree structure B(X) over X is a system of non-empty subsets $b \subset X$ of X (hence a subset of the power set
$\mathcal{P}X$ of X) such that the following axioms hold:

(A1) $X \in B(X)$.

(A2) If $b, b' \in B(X)$, then $b \subset b'$ or $b' \subset b$ or $b \cap b' = \emptyset$ [this is "exclusive or"].

(A3) For all $b, b' \in B(X)$ there exists $b'' \in B(X)$ such that $b, b' \subset b''$.

Elements $b \in B(X)$ will be called the nodes in the tree B(X). An element $b \in B(X)$ will be called primitive
if b contains only one element, i.e., $b = \{y\}$ for some $y \in X$. The tree structure B(X) will be called complete
if it contains all primitive elements, i.e., $\{y\} \in B(X)$ for all $y \in X$. This definition is clearly consistent with
the notion of completeness as given in section 3.

An element $b \in B(X)$ will be called refinable if b is not primitive; hence there exists a non-empty proper
subset $b' \subsetneq b$. If none of the subsets b' which refine b lie in the given tree B(X), we call the element b
terminal in B(X); in this case we shall also use the notation $b = b_{\mathrm{fin}}$. Thus, each node $b \in B(X)$ which is
not terminal is refinable in the given tree. On the other hand, all primitive elements are trivially non-refinable,
and hence must be terminal in B(X).

Most of the definitions we will introduce in this work will be stated as generally as possible, although our
actual conclusions regarding the Shannon-Wiener information will be worked out on finite sets only.
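
For finite sets the axioms (A1) to (A3) can be tested directly. The following Python sketch is an illustration only, with hypothetical function names and with nodes represented as Python sets; inclusions are read as non-strict, which is our assumption about the symbol used in (A2) and (A3).

```python
from itertools import combinations

def is_tree_structure(B, X):
    """Check the axioms (A1)-(A3) for a family B of subsets of X."""
    nodes = {frozenset(b) for b in B}
    base = frozenset(X)
    if any(len(b) == 0 or not b <= base for b in nodes):
        return False  # nodes must be non-empty subsets of X
    a1 = base in nodes
    a2 = all(b <= c or c <= b or b.isdisjoint(c)
             for b, c in combinations(nodes, 2))
    a3 = all(any(b <= d and c <= d for d in nodes)
             for b, c in combinations(nodes, 2))
    return a1 and a2 and a3

def is_complete_tree(B, X):
    """Complete means: every primitive element {y} is a node."""
    nodes = {frozenset(b) for b in B}
    return all(frozenset({y}) in nodes for y in X)

def terminal_nodes(B):
    """Nodes that contain no other node of the tree as a proper subset."""
    nodes = {frozenset(b) for b in B}
    return {b for b in nodes if not any(c < b for c in nodes)}

X = {1, 2, 3, 4, 5}
B = [X, {1, 2, 3}, {4, 5}, {1}, {2, 3}, {4}, {5}]
not_a_tree = [X, {1, 2}, {2, 3}]  # {1,2} and {2,3} overlap without nesting
```

In this example B is a tree structure that is not complete, because the node {2, 3} is terminal but not primitive; adding {2} and {3} yields a completion.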

5 Ordered sets 

We recall some general definitions regarding ordered sets: 

A non-empty set X is called ordered if a relation "$\prec$" is defined on X, satisfying:

(O1) For any two elements a, b of X either $a \prec b$ or $b \prec a$ or $a = b$ is true.

(O2) If $a \prec b$ and $b \prec c$ then $a \prec c$.

If the non-empty set X contains a non-empty ordered subset T, then X is said to be partially ordered.
Hence every ordered set is partially ordered. To distinguish this case from a partial ordering we sometimes
say that an ordered set X is totally ordered.

If X contains an element $x_0$ for which $x_0 \prec x$ for all $x \in X$ is true, we call $x_0$ the principal element in X
[or in the pair $(X, \prec)$, to be precise].



6 Z(X) as a partially ordered set

On the set Z(X) of all partitions of X, a natural partial ordering "$\prec$" can be introduced as follows: Let
$z, z' \in Z(X)$. The relation $z \prec z'$ is defined to be true if and only if every $b' \in z'$ is contained in some
$b \in z$ according to $b' \subset b$, and there exist $b \in z$, $b' \in z'$ for which this inclusion is proper, $b' \subsetneq b$.
In this case we say that the partition z' is a refinement of the partition z. If both z and z' are finite this implies
in particular that $\#z < \#z'$.

Given two partitions z, z', clearly none of the relations $z \prec z'$ or $z' \prec z$ or $z' = z$ need be true; this is
why the set Z(X) is only partially ordered.
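
The refinement relation can be illustrated by a short Python sketch. It is not taken from the original text; the function name `refines` is hypothetical, and both arguments are assumed to be partitions of the same finite set.

```python
def refines(z_fine, z_coarse):
    """True if z_coarse precedes z_fine, i.e. z_fine is a refinement of z_coarse."""
    fine = {frozenset(b) for b in z_fine}
    coarse = {frozenset(b) for b in z_coarse}
    # every element of the finer partition lies inside some coarser element ...
    contained = all(any(bf <= bc for bc in coarse) for bf in fine)
    # ... and at least one of these inclusions is proper
    proper = any(bf < bc for bf in fine for bc in coarse)
    return contained and proper

z = [{1, 2}, {3, 4}]
z_prime = [{1}, {2}, {3, 4}]  # refines z
w = [{1, 3}, {2, 4}]          # neither refines z nor is refined by z
```

The pair `z`, `w` shows why the ordering is only partial: neither partition refines the other, and they are not equal.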

If the partition z of X is kept fixed, we can think of the set of all partitions z' of X which are equal to z or
a refinement of z, $z \preceq z'$; they comprise the set

$$ Z(X, z) = \{ z' \in Z(X) \mid z \preceq z' \} . \qquad (10) $$

7 Subtrees 

Let B(X) be a given tree over X. Let $b \in B(X)$. The set

$$ B(b, X) = \{ b' \in B(X) \mid b' \subset b \} \qquad (11) $$

will be called the subtree of b with respect to B(X). By definition, B(b, X) is a tree structure over b, and hence
an element of M(b).

If b is non-refinable, and hence a terminal element in B(X), then $B(b, X) = \{b\}$ is trivial.
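
A minimal Python sketch of eq. (11), given here as an illustration with names of our own choosing (`fs`, `subtree`) and with the tree stored as a set of frozensets:

```python
def fs(*xs):
    """Shorthand for building a node."""
    return frozenset(xs)

def subtree(B, b):
    """Eq. (11): all nodes of B that are contained in b."""
    return {node for node in B if node <= b}

B = {fs(1, 2, 3, 4, 5), fs(1, 2, 3), fs(4, 5), fs(1), fs(2, 3), fs(4), fs(5)}
```

For the terminal node {4} the subtree is trivial, as stated above.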



8 Sum, union, extension, reduction, and completion of trees 



Let B(X) be a tree structure over X. Consider the elements $X_1, \ldots, X_{m_1}$ of level 1 in the partition z(X) of
X, as given in eq. (4) in section 3. For every $X_i$ we can think of the subtree $B(X_i, X)$ over $X_i$ with respect to
B(X). The relation of the subtrees $B(X_i, X)$, $i = 1, \ldots, m_1$, to the "parent" tree will be described by saying
that B(X) is the sum of the trees $B(X_i, X)$. Now we see how to extend this definition to tree structures over
sets which are not a priori subsets of a given set: Let $m \in \mathbb{N}$, and let $X_1, \ldots, X_m$ be non-empty pairwise
disjoint sets, i.e., $X_i \cap X_j = \emptyset$ for $i \neq j$. Let $B(X_1), \ldots, B(X_m)$ be tree structures over
$X_1, \ldots, X_m$. Then the set $\sum_{i=1}^{m} B(X_i)$, defined by

$$ \sum_{i=1}^{m} B(X_i) = \Big\{ \bigcup_{i=1}^{m} X_i \Big\} \cup \bigcup_{i=1}^{m} B(X_i) , \qquad (12) $$

will be called the sum of $B(X_1), \ldots, B(X_m)$. By construction this is a tree structure over the set
$\bigcup_{i=1}^{m} X_i$ with subtrees $B(X_1), \ldots, B(X_m)$.



Another construction is the union of trees. This is defined as follows: Let B(X) be a tree structure, and let
$b \in B(X)$ be a terminal but non-primitive element. Then $\#b > 1$. Although b is not further partitioned in the
tree B(X), we can nevertheless consider tree structures over b without reference to B(X). Let B(b) be such a
tree over b. Then we can attach B(b) to B(X) by identifying $b \in B(b)$ with $b \in B(X)$; the resulting set is the
union $B(b) \cup B(X)$, and is again a tree structure which will be called the union of the trees B(b) and B(X).

A somewhat related, but more general, concept is the extension of trees: Let B and B' be two tree structures
over the same set X. We will say that B' is an extension of B if $B' \supset B$. A special case of extension is the
completion $B^c$ of a tree B: This is defined to be a tree structure $B^c$ over the same set X as B that extends
B and is complete, i.e., $B^c$ contains all primitive nodes $\{x\}$ as x runs through X. If X is finite, every tree
structure over X admits such a completion; but clearly, there are many completions $B^c$ for a given tree B,
which differ in the paths $q(\{x\})$ of the terminal nodes, see section 9 below.

Yet another construction is the reduction B'(X) of a tree B(X) by a subtree B(b, X); this is just the
operation inverse to the union of trees, as defined above: If b is a given node in a given tree B(X), we can
remove the subtree B(b, X) from B(X) by setting

$$ B'(X) = [ B(X) - B(b, X) ] \cup \{b\} . \qquad (13) $$

The set B'(X) is a tree by construction and is obtained from B(X) by simply cutting off the branch containing
all further partitions of b, but reattaching b as a terminal element. This implies an important

Splitting principle: Every tree B(X) can be expressed as the union of any of its subtrees B(b, X) with a
cutoff tree B'(X),

$$ B(X) = B(b, X) \cup B'(X) , \qquad (14) $$

where both trees on the right-hand side are subsets of the original tree and B'(X) is defined in eq. (13).
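
The sum (12), the reduction (13) and the splitting principle (14) can be tried out on a small example. The Python sketch below is an illustration under our own conventions: trees are sets of frozensets, the root of a tree is taken to be its largest node, and the names `tree_sum`, `subtree` and `reduce_tree` are hypothetical.

```python
def fs(*xs):
    """Shorthand for building a node."""
    return frozenset(xs)

def tree_sum(trees):
    """Eq. (12): the union of the base sets as new root, plus all given trees."""
    roots = [max(T, key=len) for T in trees]  # the base set of each tree
    new_root = frozenset().union(*roots)
    return {new_root}.union(*trees)

def subtree(B, b):
    """Eq. (11): all nodes of B contained in b."""
    return {node for node in B if node <= b}

def reduce_tree(B, b):
    """Eq. (13): cut off everything below b and reattach b as a terminal node."""
    return (B - subtree(B, b)) | {b}

B1 = {fs(1, 2, 3), fs(1), fs(2, 3)}
B2 = {fs(4, 5), fs(4), fs(5)}
B = tree_sum([B1, B2])
b = fs(1, 2, 3)
cutoff = reduce_tree(B, b)
```

In this example `B` has seven nodes with root {1, 2, 3, 4, 5}, and the union of `subtree(B, b)` with `cutoff` gives back `B`, in line with eq. (14).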



9 Paths in a tree structure 

We now show that tree structures have a natural partial ordering. To this end we observe that there exist
distinct subsets in a tree structure which can be totally ordered: Let B(X) be a given tree over X. Let
$b \in B(X)$. Then we call the set

$$ q(b) = \{ b' \in B(X) \mid b' \supset b \} \qquad (15) $$

the path of b in B(X). q(b) is certainly non-empty, since it always contains X and b itself. By axiom (A2), any
two nodes in q(b) are nested, since both contain b and hence cannot be disjoint. From the definition
of q(b) we therefore see that a total ordering "$\prec$" on q(b) for all pairs of elements $(b', b'')$ of q(b) can be
defined by setting $b' \prec b''$ if and only if $b' \supsetneq b''$. This makes the path q(b) a totally ordered set, for all
nodes $b \in B(X)$. As a consequence, the tree B(X) is a partially ordered set. If q(b) is finite, its cardinality is a
natural number which we denote by o(b),

$$ o(b) = \# q(b) , \qquad (16) $$

and which we shall call the length of the path q(b) in the tree B(X).

Given the natural ordering of the path q(b) as defined above, we obviously have $b' \prec b$ for all $b' \in q(b)$ with
$b' \neq b$, by construction of q(b). Provided that $b \neq X$, it follows that there always exists an element
$b^- \in q(b)$ such that $b^- \prec b$ but $b' \prec b^-$ for all $b' \in q(b)$ with $b' \neq b^-, b$; this distinct element will be
called the predecessor of b in the tree B(X). Thus all nodes $b \in B(X)$ except for X have a unique predecessor in
B(X); X itself has no predecessor in B(X). The predecessor is equal to the "smallest" node in B(X) which contains
b as a proper subset; obviously, its own path $q(b^-)$ coincides with the path q(b) just up to b itself,

$$ q(b) = q(b^-) \cup \{b\} , \qquad b \notin q(b^-) . \qquad (17) $$



Notation conventions: We introduce some notation conventions that will prove convenient in the sequel.
If $b \in B(X)$ and q(b) is the path of b in B(X), then we denote

$$ q^*(b) = q(b) - \{b\} . \qquad (18) $$

If B(b', X) is a subtree of B(X) and if $b \in B(b', X)$, then the path of b in B(b', X) will be denoted by

$$ q_{B(b',X)}(b) = \{ a \in B(b', X) \mid a \supset b \} . \qquad (19) $$
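
Paths, path lengths and predecessors are easy to compute for finite trees. The Python sketch below is an illustration with hypothetical names (`path`, `length`, `predecessor`), using a set of frozensets as the tree and returning `None` for the missing predecessor of X.

```python
def fs(*xs):
    """Shorthand for building a node."""
    return frozenset(xs)

def path(B, b):
    """Eq. (15): all nodes of B that contain b (including b and X)."""
    return {a for a in B if a >= b}

def length(B, b):
    """Eq. (16): the number of nodes on the path of b."""
    return len(path(B, b))

def predecessor(B, b):
    """The smallest node containing b as a proper subset; None for the root X."""
    above = [a for a in B if a > b]
    return min(above, key=len) if above else None

X = fs(1, 2, 3, 4, 5)
B = {X, fs(1, 2, 3), fs(4, 5), fs(1), fs(2, 3), fs(4), fs(5)}
b = fs(2, 3)
```

Since the nodes on a path are nested, sorting `path(B, b)` by decreasing size lists them in the order of the total ordering; for the node {2, 3} this gives X, {1, 2, 3}, {2, 3}.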



10 Partitions compatible with a given tree 

Consider a given node b in the tree B(X). The subtree B(b, X) defines a distinct set of partitions of b in the
following way: Each distinct partition z is a collection $z = \{ b_1', b_2', \ldots \}$ of mutually disjoint subsets
$b_i' \subset b$ whose union is b such that each $b_i'$ is also an element of B(b, X). Thus, $z \subset B(b, X)$. Such a
preferred partition of b will be called compatible with the tree B(X). The set of all partitions of b compatible
with the tree B(X) will be denoted by $\zeta(b)$,

$$ \zeta(b) = \{ z \in Z(b) \mid z \subset B(b, X) \} . \qquad (20) $$

If it is necessary to point out that the compatibility refers to the given tree B(X) we shall also use
the extended notation $\zeta(b, B)$. Similarly, we define $\zeta^*(b)$ to be the set of nontrivial partitions in
$\zeta(b)$, i.e. $\zeta^*(b) = \zeta(b) - \{ \{b\} \}$.

10.1 The maximal compatible partition $z_{\max}(b)$

The set $\zeta(X)$ contains several distinct partitions: firstly, the trivial partition $\{X\}$; and secondly, the partition
of X which is constituted by the set of all terminal nodes in B(X). Similarly, if $b \in B(X)$ is arbitrary, then
$\zeta(b)$ contains the trivial partition $\{b\}$ as well as the partition of b which is constituted by all terminal elements
in the subtree B(b, X). The latter will be denoted by $z_{\max}(b)$, and will be called the maximal partition of b
in the tree B(X). Its elements are those terminal nodes $b_{\mathrm{fin}}$ of B(X) which are also subsets of b. Equivalently
we can say that the elements of the maximal partition $z_{\max}(b)$ of b are those terminal elements $b_{\mathrm{fin}}$ of B(X)
whose paths $q(b_{\mathrm{fin}})$ contain b,

$$ z_{\max}(b) = \{ b_{\mathrm{fin}} \in B(X) \mid b \in q(b_{\mathrm{fin}}) \} . \qquad (21) $$

Since the maximal partition is defined in terms of terminal elements it exhibits maximality in the following
sense: $z_{\max}(b)$ refines any other partition z of b which is compatible with B(X),

$$ z \preceq z_{\max}(b) \quad \text{for all } z \in \zeta(b, B) . \qquad (22) $$

As a consequence, the number of elements in $z_{\max}(b)$ is greater than or equal to the number of elements in
any other $z \in \zeta(b, B)$,

$$ \# z \le \# z_{\max}(b) \quad \text{for all } z \in \zeta(b, B) . \qquad (23) $$

If b is terminal, then the maximal partition is the trivial partition, $z_{\max}(b) = \{b\}$, as follows from eq. (21).
In this case $\# z_{\max}(b) = 1$.

If b is not a terminal node then $z_{\max}(b) \in \zeta^*(b)$. If b = X, then the union of all paths $q(b_{\mathrm{fin}})$ with
$b_{\mathrm{fin}} \in z_{\max}(X)$ renders the whole tree structure B(X),

$$ \bigcup_{b_{\mathrm{fin}} \in z_{\max}(X)} q(b_{\mathrm{fin}}) = B(X) . \qquad (24) $$

The maximal partition is related to the concept of reduced trees, section 8, and the concept of the set of
partitions $\zeta(b)$ compatible with B(X), in the following way:

Theorem 10.2. Let B(X) be a given tree over X. The set $\zeta(X)$ of partitions of X compatible with B(X)
is comprised of the maximal partitions $z'_{\max}(X)$ of X with respect to B'(X), where B'(X) ranges through all
reduced trees (13) associated with B(X).

Proof:

Let $z \in \zeta(X)$; then all elements $b_1, b_2, \ldots$ of z are elements of B(X). Consider the union of paths

$$ \bigcup_i q(b_i) =: B'(X) , \qquad (25) $$

where $q(b_i)$ are the paths of $b_i$ in B(X). By construction, the right-hand side B'(X) is a tree structure, and is
also a reduction of the original tree B(X), with maximal partition $z_{\max}(X, B') = z$. Conversely, let B'(X) be
a reduction of B(X) with maximal partition $z'_{\max}(X)$; then all elements $b' \in z'_{\max}(X)$ lie in B(X) by definition
of a reduced tree; hence $z'_{\max}(X)$ is compatible with B(X). ■
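
For a finite tree the set of compatible partitions can be enumerated explicitly. The Python sketch below is an illustration only; the names `children`, `compatible_partitions` and `maximal_partition` are ours. The enumeration relies on the observation, which follows from axiom (A2), that every nontrivial compatible partition of b is assembled from compatible partitions of the next-level nodes below b.

```python
from itertools import product

def fs(*xs):
    """Shorthand for building a node."""
    return frozenset(xs)

def children(B, b):
    """The largest nodes of B properly contained in b (the next level below b)."""
    proper = [a for a in B if a < b]
    return [a for a in proper if not any(a < c for c in proper)]

def compatible_partitions(B, b):
    """All partitions of b whose elements are nodes of B, cf. eq. (20)."""
    result = [frozenset({b})]  # the trivial partition {b}
    kids = children(B, b)
    if kids:
        # pick one compatible partition for each child and merge the choices
        for combo in product(*(compatible_partitions(B, c) for c in kids)):
            result.append(frozenset().union(*combo))
    return result

def maximal_partition(B, b):
    """Eq. (21): the terminal nodes of B that lie inside b."""
    return frozenset(a for a in B if a <= b and not any(c < a for c in B))

X = fs(1, 2, 3, 4, 5)
B = {X, fs(1, 2, 3), fs(4, 5), fs(1), fs(2, 3), fs(4), fs(5)}
zeta_X = compatible_partitions(B, X)
z_max = maximal_partition(B, X)
```

In this example there are five compatible partitions of X, and the maximal partition, made up of the four terminal nodes, has the largest number of elements among them, in agreement with eq. (23).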



10.3 The minimal compatible partition £ mm (6) 

Let 6 be a given non-terminal node in the tree. The partition of b that is obtained by stepping to the next level 
in the tree will be called the minimal partition z m i n (b) of b in B(X). The elements b' in the minimal partition 
are uniquely characterised by the feature that they all have the node b as their predecessor, and there are no 
further nodes b' which have this predecessor. We can therefore write 

z min (b) = {b' eB(X)\ (b'y = b} . (26) 

Zmin(b) has another minimal property which can be alternatively used to define it as a set: The minimal 
partition z m { n (b) of b is uniquely characterised by the fact that it is refined by any non-trivial partition of b 
compatible with B{X) 

z m - m (b)^z for all z G (^*(b,B) , (27) 
and as a consequence contains the least number of elements, 

#z min (b)<#z for all zeC(b,B) . (28) 
This follows immediately from the definition (|26|) of z mm . 

The above definition (|26|) or, alternatively, eq. (|27|) . is meaningful only when b is not a terminal element of 
the given tree B(X). If b is terminal we define the minimal partition to be the trivial partition, z m i n (b) = {b}. 
In this case #z m in(&) = 1- 

Let $b$ be any non-terminal element of $B(X)$. Given the minimal partition $z_{\min}(b)$ of $b$ in $B(X)$, we can split the subtree $B(b,X)$ accordingly into a sum of subtrees, attached to the node $b$ itself,

$$B(b,X) = \{b\} \cup \bigcup_{b' \in z_{\min}(b)} B(b',X) . \tag{29}$$

We shall make use of this fact frequently. 

The number of elements in the minimal partition $z_{\min}(b,B)$ of $b$ in the given tree $B(X)$ will be denoted as

$$m(b) = \# z_{\min}(b,B) . \tag{30a}$$

Similarly, the number of elements of $b$ regarded as a set will be denoted by $n(b)$,

$$n(b) = \# b . \tag{30b}$$

These quantities pertain to the nodes $b \in B(X)$ in a specific way and will play a crucial role in what follows. For a non-terminal node $b$ we must have

$$n(b) = \sum_{a \in z_{\min}(b)} n(a) . \tag{31}$$

For every $b \in B(X)$ the following inequality holds:

$$1 \le m(b) \le n(b) . \tag{32}$$

Furthermore, if $b \neq X$, then $b$ has a unique predecessor $b^-$ whose minimal partition $z_{\min}(b^-)$ has $m(b^-)$ elements, one of which is just $b$. Each of the nodes in $z_{\min}(b^-)$ contains at least one element, and there are $m(b^-) - 1$ nodes apart from $b$; hence

$$n(b^-) \ge m(b^-) - 1 + n(b) , \tag{33}$$

or

$$m(b^-) - 1 \le n(b^-) - n(b) . \tag{34}$$
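
The quantities $m(b)$ and $n(b)$ are easy to experiment with on a small example. The following is a minimal sketch, assuming that a tree over $X$ is stored as a Python set of frozensets (one frozenset per node); this representation and the helper names `children`, `predecessor`, `m` and `n` are illustrative choices, not notation from the text. It checks eqs. (31), (32) and (34) on an incomplete tree over five elements.

```python
def children(tree, b):
    """z_min(b): the nodes whose predecessor is b (empty if b is terminal)."""
    below = [c for c in tree if c < b]               # nodes properly contained in b
    return {c for c in below if not any(c < d for d in below)}

def predecessor(tree, b):
    """b^-: the smallest node strictly containing b (None for the root X)."""
    above = [a for a in tree if b < a]
    return min(above, key=len) if above else None

def m(tree, b):
    """Number of elements of the minimal partition; 1 for a terminal node."""
    return len(children(tree, b)) or 1

def n(b):
    """Number of elements of b regarded as a set."""
    return len(b)

# Hypothetical example tree: X = {a,...,e}; the node {b,c} is terminal,
# so the tree is incomplete.
X = frozenset("abcde")
tree = {X, frozenset("abc"), frozenset("de"),
        frozenset("a"), frozenset("bc"), frozenset("d"), frozenset("e")}

for b in tree:
    kids = children(tree, b)
    if kids:
        assert n(b) == sum(n(a) for a in kids)        # eq. (31)
    assert 1 <= m(tree, b) <= n(b)                    # eq. (32)
    p = predecessor(tree, b)
    if p is not None:
        assert m(tree, p) - 1 <= n(p) - n(b)          # eq. (34)
```

The proper-subset comparison `c < b` on frozensets plays the role of the ordering of nodes along a path.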



10.4 Trees reduced by a partition 



By means of the concept of a partition compatible with a given tree we can introduce a generalization of the idea of reduced trees as given in eq. (13).

Let $B(X)$ be a given tree over the set $X$. Let $z \in \zeta(X,B)$ be a partition of $X$ compatible with the tree $B(X)$. Then we can construct a new tree $B(z)$ as follows: For each $b \in z$, we remove the subtree $B(b,X)$ from $B(X)$ but reattach $b$ as a terminal element; this is just the proper generalization of eq. (13). The set so obtained is again a tree by construction:

Definition 10.5 (Tree reduced by a partition). The tree structure $B(z)$ defined by

$$B(z) = \Big[\, B(X) - \bigcup_{b \in z} B(b,X) \,\Big] \cup \bigcup_{b \in z} \{b\} \tag{35}$$

is called the tree $B(X)$ reduced by the partition $z \in \zeta(X,B)$.



If b is an element of the reduced tree B(z), then the subtree of b in the reduced tree will be denoted by 
B(b,z). 
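
The construction in eq. (35) can be sketched in a few lines. This is only an illustration, assuming again that a tree is stored as a set of frozensets; the names `subtree` and `reduce_by_partition` are hypothetical.

```python
def subtree(tree, b):
    """B(b, X): the node b together with every node of the tree contained in it."""
    return {c for c in tree if c <= b}

def reduce_by_partition(tree, z):
    """B(z) of eq. (35): remove every subtree B(b, X) with b in z,
    then reattach each b as a terminal element."""
    removed = set().union(*(subtree(tree, b) for b in z))
    return (tree - removed) | set(z)

# Hypothetical example tree over X = {a,...,e}.
X = frozenset("abcde")
tree = {X, frozenset("abc"), frozenset("de"),
        frozenset("a"), frozenset("bc"), frozenset("d"), frozenset("e")}

# A partition of X compatible with the tree: all of its elements are nodes.
z = {frozenset("abc"), frozenset("d"), frozenset("e")}
reduced = reduce_by_partition(tree, z)
```

In the reduced tree the elements of `z` are exactly the terminal nodes, so `z` is its maximal partition.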



11 Amount functions 

From now on we explicitly assume that X is a finite set. As a consequence, the quantities m(b) and n(b) are 
always finite natural numbers. 

Let $b \in B(X)$ with $b \neq X$. Then $b^-$ exists, and the number of elements in $z_{\min}(b^-)$ is $m(b^-)$. Now we think of $b$ as being distinct in the set of elements $b'$ comprising $z_{\min}(b^-)$. Suppose we are presented with the set $z_{\min}(b^-) = \{b_1, \ldots, b_{m(b^-)}\}$, as in the scenario laid out earlier, and we are asked to find out which of the $b_i$ is the distinct one. Presuming that no optimized search strategy is employed, we have to expend at most $m(b^-) - 1$ questions in order to fulfill our task.

We can now extend this reasoning to the whole path $q(b)$: $b^-$ is distinct in the set of all $b''$ comprising the minimal decomposition $z_{\min}(b^{2-})$, where $b^{2-}$ denotes the predecessor of $b^-$ in $B(X)$. In order to determine $b^-$ amongst the $m(b^{2-})$ elements of $z_{\min}(b^{2-})$ we have to expend at most $m(b^{2-}) - 1$ questions. We can continue in this way up the whole path $q(b)$ until no predecessor exists any longer, in other words, until $b^{k-} = X$.

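The maximum number of questions to determine the distinct node $b \in B(X)$ we were seeking out is therefore the sum of all these contributions,

$$e(b) = \sum_{a \in q(b) \setminus \{b\}} [\, m(a) - 1 \,] = \Big\{ \sum_{a \in q(b) \setminus \{b\}} m(a) \Big\} - o(b) + 1 , \tag{36}$$

where $o(b)$ is the length of the path $q(b)$ as defined earlier, i.e. the number of nodes on $q(b)$ including $b$ itself.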



Definition 11.1 (Amount of a node). The quantity $e(b)$ in eq. (36) will be called the amount of $b$ in the tree $B(X)$.

When emphasizing the fact that the amount is dependent on the underlying tree we shall also use the notation $e_B(b)$.



Remark: In eq. (36), the element $b$ is excluded from the summation, since $a$ ranges over $q(b) \setminus \{b\}$ only. If $b$ is terminal in $B(X)$ then $m(b) = 1$; in this case we can trivially extend the sums in (36) to range over the whole path $q(b)$, since the additional contribution $m(b) - 1$ is zero on account of $m(b) = 1$. It is then possible to write the path amount (36) as

$$e(b) = \sum_{a \in q(b)} [\, m(a) - 1 \,] \quad \text{for all } b \in z_{\max}(X) . \tag{37}$$



We shall frequently make use of this convention. 

Now let $z \in \zeta(X)$ be an arbitrary partition of $X$ compatible with the tree $B(X)$. Then every element $b \in z$ has the uniquely defined path $q(b) \subset B(X)$. Hence it makes sense to sum up the amounts $e_B(b)$ of each $b$:

Definition 11.2 (Total amount of partition). The quantity $G(z)$, defined by

$$G(z) = \sum_{b \in z} e_{B(X)}(b) = \sum_{b \in z} \; \sum_{a \in q(b) \setminus \{b\}} [\, m(a) - 1 \,] , \tag{38}$$

is called the total amount of $z \in \zeta(X)$ with respect to the tree $B(X)$. If $z$ contains only one element we define the associated amount to be $G(z) = 0$.

When emphasizing the fact that the total amount is dependent on the underlying tree structure we shall also denote $G(z) = G_B(z)$.

Now consider the maximal partition $z_{\max}(X)$ of $X$ in $B(X)$: From

$$G_{B(X)} \equiv G(z_{\max}(X)) = \sum_{b \in z_{\max}(X)} e_{B(X)}(b) = \sum_{b \in z_{\max}(X)} \; \sum_{a \in q(b) \setminus \{b\}} [\, m(a) - 1 \,] \tag{39}$$

we see that in this case we sum over all but the terminal elements $b \in B(X)$; hence the total amount for the maximal partition of $X$ in $B(X)$ is dependent on $B(X)$ only; this is reflected in our notation. The sum $G_{B(X)}$ therefore defines a map from the set of all tree structures over $X$ into the natural numbers,

$$G : \mathcal{M}(X) \to \mathbb{N} , \quad B(X) \mapsto G_{B(X)} . \tag{40}$$

Definition 11.3 (Total amount of tree. Amount function). The quantity $G_{B(X)}$ is called the total amount of the tree structure $B(X)$. The map $G$, as defined in eq. (40), is called the amount function on $\mathcal{M}(X)$.

Remark 1: If the tree $B(X) = \{X\}$ is trivial we again define the associated total amount to be zero, $G_{B(X)} = 0$.

Remark 2: The total amount $G(z)$ with respect to the partition $z \in \zeta(X)$ compatible with the given tree $B(X)$, defined in eq. (38), is equal to the total amount $G_{B(z)}$ of the reduced tree $B(z)$, which is obtained by reducing $B(X)$ via $z$ according to definition 10.5. We can use this fact to emphasize that the total amount of the partition $G(z)$ is dependent on the underlying tree structure $B(X)$,

$$G(z) \equiv G_{B(z)} . \tag{41}$$

Proposition 11.4 (Inequalities). For every $b \in B(X)$ the following inequalities hold:

$$o(b) - 1 \le e(b) \le n(X) - n(b) \le n(X) - 1 . \tag{42}$$

Proof:

Set $o(b) = \kappa$ and $q(b) = \{\beta_1, \ldots, \beta_\kappa\}$ with $\beta_1 = X$ and $\beta_\kappa = b$. Then $q(b) \setminus \{b\} = \{\beta_1, \ldots, \beta_{\kappa-1}\}$, and we must have

$$e(b) = \sum_{j=1}^{\kappa-1} [\, m(\beta_j) - 1 \,] = \sum_{j=2}^{\kappa} [\, m(\beta_{j-1}) - 1 \,] . \tag{43}$$

For all $j \in \{1, \ldots, \kappa - 1\}$ we must have $m(\beta_j) \ge 2$. If this is inserted into eq. (43) we obtain the first inequality in (42). If eq. (34) is inserted into (43) we find

$$e(b) \le \sum_{j=2}^{\kappa} [\, n(\beta_{j-1}) - n(\beta_j) \,] = \sum_{j=1}^{\kappa-1} n(\beta_j) - \sum_{j=2}^{\kappa} n(\beta_j) = n(\beta_1) - n(\beta_\kappa) = n(X) - n(b) . \tag{44}$$

This yields the second inequality in (42). The last inequality follows trivially from the fact that $n(b) \ge 1$. ■
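
The amounts and the inequalities (42) can be checked numerically on a small tree. The sketch below assumes a tree stored as a set of frozensets; the function names `path`, `amount` and `total_amount` are illustrative only.

```python
def path(tree, b):
    """q(b): all nodes containing b, ordered from the root X down to b."""
    return sorted((a for a in tree if b <= a), key=len, reverse=True)

def m(tree, b):
    """Number of elements of z_min(b); 1 if b is terminal."""
    below = [c for c in tree if c < b]
    kids = [c for c in below if not any(c < d for d in below)]
    return len(kids) or 1

def amount(tree, b):
    """e(b) of eq. (36): sum of m(a) - 1 over the path, b itself left out."""
    return sum(m(tree, a) - 1 for a in path(tree, b)[:-1])

def total_amount(tree, z):
    """G(z) of eq. (38) for a partition z whose elements are nodes of the tree."""
    return sum(amount(tree, b) for b in z)

# Hypothetical example tree over X = {a,...,e}.
X = frozenset("abcde")
tree = {X, frozenset("abc"), frozenset("de"),
        frozenset("a"), frozenset("bc"), frozenset("d"), frozenset("e")}

for b in tree:
    o = len(path(tree, b))                             # length o(b) of the path
    assert o - 1 <= amount(tree, b) <= len(X) - len(b) <= len(X) - 1   # eq. (42)

z_max = {b for b in tree if m(tree, b) == 1}           # terminal nodes
G_tree = total_amount(tree, z_max)                     # total amount of the tree
```

For this tree every terminal node has amount 2, so the total amount of the tree is 8.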



12 Induced Partitions 

Let $B(X)$ be a given tree over $X$. Let $b \in B(X)$. For every $z \in \zeta(X)$ we can introduce the intersection

$$\sigma(z,b) = z \cap B(b,X) . \tag{45}$$

$\sigma(z,b)$ can be empty if all elements in $z$ are "coarser" than $b$, i.e., $b \subsetneq b'$ for precisely one $b' \in z$, and $b$ has zero intersection with the rest. If $\sigma(z,b)$ is non-empty, it is a partition of $b$ compatible with the subtree $B(b,X)$, as we show now:

Theorem 12.1 (Induced partitions). 

(A) If $\sigma(z,b) \neq \emptyset$ then $\sigma(z,b) \in \zeta(b,B)$.

(B) Conversely, if $\bar z \in \zeta(b,B)$, then there exists $z \in \zeta(X)$ such that $\sigma(z,b) = \bar z$.

(C) Definition: If $\sigma(z,b)$ is non-empty it is called the partition of $b$ induced by $z$.

Proof: 

Assume that $\sigma(z,b) \neq \emptyset$; then $\sigma(z,b) = \{b_1, b_2, \ldots\}$, where $b_i \in B(b,X)$. Now let $\bar b \in z - \sigma(z,b)$; then $\bar b$ cannot intersect $b$: For, $\bar b$ lies in $B(X)$ but not in the subtree $B(b,X)$ by assumption; hence if it intersects $b$ then it must contain $b$ properly, $\bar b \supsetneq b$. But then it also contains all $b_i$ as subsets, which contradicts the fact that the $b_i$, $\bar b$ are mutually disjoint. Thus, $\bar b \cap b = \emptyset$. It follows that the union of all $\bar b \in z - \sigma(z,b)$ has no intersection with $b$. As a consequence, the union of all $b_i$ must be equal to $b$, since $z$ is a partition of $X$. Furthermore, the $b_i$ are mutually disjoint, and lie in $B(b,X)$, from which it follows that $\{b_1, b_2, \ldots\} \in \zeta(b,B)$. This proves (A).

Let $\bar z \in \zeta(b,B)$ be given. Now represent $B(X)$ as a union $B(X) = B'(X) \cup B(b,X)$ of trees, where the reduced tree $B'(X)$ is given in eq. (13). Then the maximal partition $z_{\max}(X,B')$ of $X$ in the reduced tree contains $b$ as an element. We now define a new partition by removing $b$ from $z_{\max}(X,B')$ and replacing it by the set of elements in $\bar z$,

$$z = [\, z_{\max}(X,B') - \{b\} \,] \cup \bar z . \tag{46}$$

The set $z$ so defined is obviously a partition of $X$ compatible with $B(X)$, hence $z \in \zeta(X,B)$, and, by construction, $z \cap B(b,X) = \bar z$. This proves (B). ■

The concept of induced partitions is linked to the idea of refinements of partitions: 

Proposition 12.2 (Refinement of partitions). Let $B(X)$ be a tree over $X$. Let $z, z' \in \zeta(X)$ with $z \prec z'$. Then

(A) $z - (z \cap z') \neq \emptyset$.

(B) $\sigma(z',b) \in \zeta^*(b)$ for all $b \in z - (z \cap z')$; whereas $\sigma(z',b) = \{b\}$ is the trivial partition for all $b \in z \cap z'$.

(C)

$$z' - (z' \cap z) = \bigcup_{b \in z - (z \cap z')} \sigma(z',b) . \tag{47}$$



Proof: 

Since $z'$ is a refinement of $z$, any element $b'$ of $z'$ is contained in some element $b$ of $z$ as a subset, $b' \subset b$. $z \cap z'$ contains all elements which are not partitioned under the refinement $z \to z'$. This means that for all $b \in z \cap z'$, $z' \cap B(b,X) = \{b\}$, hence $\sigma(z',b)$ is the trivial partition of $b$. This proves the second statement in (B). On the other hand, if $b \in z - (z \cap z')$, then $b$ undergoes a proper partition under the refinement $z \to z'$. This implies that $\sigma(z',b) \in \zeta^*(b)$, thus proving the first statement in (B). Since $z'$ is a proper refinement of $z$, there must exist at least one element of $z$ that undergoes a proper partition, which says that $z - (z \cap z')$ cannot be empty, hence (A). Finally,

$$z' - (z' \cap z) = z' \cap \Big[ \bigcup_{b \in z - (z \cap z')} B(b,X) \Big] = \bigcup_{b \in z - (z \cap z')} z' \cap B(b,X) , \tag{48}$$

but $z' \cap B(b,X) = \sigma(z',b)$, hence eq. (47) follows. ■

The splitting theorem describes the behaviour of total amount functions of reduced trees B(z) and B(z'), 
where z' is a refinement of z: 

Theorem 12.3 (Splitting theorem). Let $B(X)$ be a given tree over $X$. Let $z, z' \in \zeta(X)$ with $z \prec z'$. Let $B(z)$ and $B(z')$ be the corresponding reduced trees. Then

$$G(z') - G(z) = \sum_{b \in z - (z \cap z')} [\, \#\sigma(z',b) - 1 \,] \cdot e_{B(z)}(b) + \sum_{b \in z - (z \cap z')} G_{B(b,z')} . \tag{49}$$

Proof: 
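Using eq. (41) we have

$$
\begin{aligned}
G(z') &= \sum_{b' \in z'} e_{B(z')}(b') = \sum_{b' \in z' \cap z} e_{B(z')}(b') + \sum_{b' \in z' - (z' \cap z)} e_{B(z')}(b') \\
&= \sum_{b' \in z' \cap z} e_{B(z')}(b') + \sum_{b' \in z' - (z' \cap z)} \; \sum_{a' \in q_{B(z')}(b') \setminus \{b'\}} [\, m(a') - 1 \,] \\
&= \sum_{b' \in z' \cap z} e_{B(z')}(b') + ZS ,
\end{aligned} \tag{50}
$$

where

$$ZS = \sum_{b' \in z' - (z \cap z')} \; \sum_{a' \in q_{B(z')}(b') \setminus \{b'\}} [\, m(a') - 1 \,] . \tag{51}$$

Here $q_{B(z')}(b')$ is the path of $b'$ in $B(z')$, and $b'$ itself is excluded from the summation as in eq. (36). With the help of eq. (47) we can split the sums in $ZS$ further:

$$ZS = \sum_{b \in z - (z \cap z')} \; \sum_{b' \in \sigma(z',b)} \; \sum_{a' \in q_{B(z')}(b') \setminus \{b'\}} [\, m(a') - 1 \,] . \tag{52}$$

Since $b \in z - (z \cap z')$ and $b' \in \sigma(z',b)$, we have $b \in q_{B(z')}(b')$. But

$$\{\, a' \in q_{B(z')}(b') \setminus \{b'\} \mid a' \subseteq b \,\} = q_{B(b,z')}(b') \setminus \{b'\} , \tag{53a}$$

and

$$\{\, a' \in q_{B(z')}(b') \mid a' \supsetneq b \,\} = q_{B(z)}(b) \setminus \{b\} , \tag{53b}$$

for all $b' \in \sigma(z',b)$ and $b \in z - (z \cap z')$. Hence we can write

$$
\begin{aligned}
q_{B(z')}(b') \setminus \{b'\} &= \{\, a' \in q_{B(z')}(b') \mid a' \supsetneq b \,\} \cup \{\, a' \in q_{B(z')}(b') \setminus \{b'\} \mid a' \subseteq b \,\} \\
&= \big[ q_{B(z)}(b) \setminus \{b\} \big] \cup \big[ q_{B(b,z')}(b') \setminus \{b'\} \big] .
\end{aligned} \tag{54}
$$

This yields

$$
\begin{aligned}
ZS &= \sum_{b \in z - (z \cap z')} \; \sum_{b' \in \sigma(z',b)} \Big\{ \sum_{a' \in q_{B(z)}(b) \setminus \{b\}} [\, m(a') - 1 \,] + \sum_{a' \in q_{B(b,z')}(b') \setminus \{b'\}} [\, m(a') - 1 \,] \Big\} \\
&= \sum_{b \in z - (z \cap z')} \; \sum_{b' \in \sigma(z',b)} e_{B(z)}(b) + \sum_{b \in z - (z \cap z')} \; \sum_{b' \in \sigma(z',b)} e_{B(b,z')}(b') \\
&= \sum_{b \in z - (z \cap z')} \#\sigma(z',b) \cdot e_{B(z)}(b) + \sum_{b \in z - (z \cap z')} \; \sum_{b' \in \sigma(z',b)} e_{B(b,z')}(b') ,
\end{aligned} \tag{55}
$$

where we have used the definition of the amount $e_{B(z)}(b)$. We now insert $ZS$ into eq. (50) for $G(z')$:

$$
\begin{aligned}
G(z') &= \sum_{b' \in z' \cap z} e_{B(z')}(b') + \sum_{b \in z - (z \cap z')} \#\sigma(z',b) \cdot e_{B(z)}(b) + \sum_{b \in z - (z \cap z')} \; \sum_{b' \in \sigma(z',b)} e_{B(b,z')}(b') \\
&= \sum_{b \in z \cap z'} e_{B(z')}(b) + \sum_{b \in z - (z \cap z')} e_{B(z)}(b) \\
&\quad + \sum_{b \in z - (z \cap z')} [\, \#\sigma(z',b) - 1 \,] \cdot e_{B(z)}(b) + \sum_{b \in z - (z \cap z')} G_{B(b,z')} ,
\end{aligned} \tag{56}
$$

where we have used the fact that, for elements $b \in z \cap z'$, the amount $e_{B(z')}(b)$ of the path of $b$ in the tree $B(z')$ is equal to the amount $e_{B(z)}(b)$ of the path of $b$ in the tree $B(z)$. Then the first two terms on the right-hand side of the last equation combine to give

$$\sum_{b \in z \cap z'} e_{B(z')}(b) + \sum_{b \in z - (z \cap z')} e_{B(z)}(b) = \sum_{b \in z} e_{B(z)}(b) = G(z) . \tag{57}$$

If eq. (57) is inserted into eq. (56) we obtain eq. (49). ■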



We see from eq. (49) that there are two contributions to the difference in the total amounts: The first one links the amounts $e_{B(z)}(b)$ of the paths $q(b)$ of $b$ in $B(z)$ to the "degree of splitting" $\#\sigma(z',b)$ of the set $b$ under the refinement $z \to z'$; the second one is the sum of all amounts $G_{B(b,z')}$ of the subtrees $B(b,z')$ of the larger tree $B(z')$.

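Corollary 12.4. For the special case $z = z_{\min}(X)$ and $z' = z_{\max}(X)$ we have

$$G_{B(X)} = [\, m(X) - 1 \,] \cdot \sum_{b \in z_{\min}(X)} \# z_{\max}(b) + \sum_{b \in z_{\min}(X)} G_{B(b,X)} . \tag{58}$$

If the tree $B(X)$ is complete then

$$G_{B(X)} = [\, m(X) - 1 \,] \cdot n(X) + \sum_{b \in z_{\min}(X)} G_{B(b,X)} . \tag{59}$$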



Proof: 

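For $z = z_{\min}(X)$, $z' = z_{\max}(X)$ we have $G(z) = m(X) \cdot [\, m(X) - 1 \,]$, as follows from the definition of the total amount, and $G(z') = G_{B(X)}$, since $B(z_{\max}(X)) = B(X)$ is the total tree $B(X)$. Furthermore, for $b \in z - (z \cap z')$ we have $\sigma(z',b) = z_{\max}(b) \in \zeta^*(b)$, and $G_{B(b,z')} = G_{B(b,X)}$. Then eq. (49) gives

$$
\begin{aligned}
G_{B(X)} &= m(X) \cdot [\, m(X) - 1 \,] + \sum_{b \in z - (z \cap z')} [\, \# z_{\max}(b) - 1 \,] \cdot \underbrace{e_{B(X)}(b)}_{= \, m(X) - 1} + \sum_{b \in z - (z \cap z')} G_{B(b,X)} \\
&= [\, m(X) - 1 \,] \cdot \Big\{ m(X) + \sum_{b \in z - (z \cap z')} [\, \# z_{\max}(b) - 1 \,] \Big\} + \sum_{b \in z - (z \cap z')} G_{B(b,X)} .
\end{aligned} \tag{60}
$$

However, in all sums we can extend the range of $b$ to take values in $z \cap z'$ as well; for such an element $b$, $\# z_{\max}(b) = 1$, and $G_{B(b,X)} = 0$. Thus, $b$ can be allowed to run over the whole set $z = z_{\min}(X)$,

$$G_{B(X)} = [\, m(X) - 1 \,] \cdot \Big\{ m(X) + \sum_{b \in z_{\min}(X)} [\, \# z_{\max}(b) - 1 \,] \Big\} + \sum_{b \in z_{\min}(X)} G_{B(b,X)} . \tag{61}$$

Since the sum of the terms $-1$ over $z_{\min}(X)$ equals $-m(X)$, the first and the third contribution in curly brackets cancel each other; thus, we arrive at eq. (58).
If the tree is complete then the maximal partition $z_{\max}(X)$ is complete, i.e.,

$$z_{\max}(X) = \{ \{x_1\}, \{x_2\}, \ldots, \{x_n\} \} , \tag{62}$$

where $x_j$ are the elements of $X$. In this case, each of the maximal partitions $z_{\max}(b)$ for $b \in z_{\min}(X)$ is complete, so that

$$\# z_{\max}(b) = \# b \quad \text{for all } b \in z_{\min}(X) . \tag{63}$$

Consequently,

$$\sum_{b \in z_{\min}(X)} \# z_{\max}(b) = \sum_{b \in z_{\min}(X)} \# b = n(X) , \tag{64}$$

from which eq. (59) follows. ■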
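
Corollary 12.4 lends itself to a direct numerical check. The following sketch, with trees stored as sets of frozensets and with the hypothetical helper names `kids`, `leaves`, `G` and `corollary_rhs`, computes the total amount of a (sub)tree from its definition and compares it with the right-hand side of eq. (58); for the complete example tree this is also eq. (59).

```python
def kids(tree, b):
    """z_min(b): children of b in the tree (empty list if b is terminal)."""
    below = [c for c in tree if c < b]
    return [c for c in below if not any(c < d for d in below)]

def leaves(tree, b):
    """z_max(b): terminal nodes of the subtree B(b, X)."""
    return [c for c in tree if c <= b and not kids(tree, c)]

def G(tree, root):
    """Total amount of the subtree rooted at `root`, from eqs. (36) and (38)."""
    return sum(len(kids(tree, a)) - 1
               for t in leaves(tree, root)
               for a in tree if t < a <= root)

def corollary_rhs(tree, X):
    """Right-hand side of eq. (58)."""
    first_level = kids(tree, X)
    m_X = len(first_level)
    return ((m_X - 1) * sum(len(leaves(tree, b)) for b in first_level)
            + sum(G(tree, b) for b in first_level))

# Hypothetical examples: an incomplete tree over 5 elements and a complete tree over 3.
X5 = frozenset("abcde")
incomplete = {X5, frozenset("abc"), frozenset("de"),
              frozenset("a"), frozenset("bc"), frozenset("d"), frozenset("e")}
X3 = frozenset("abc")
complete = {X3, frozenset("a"), frozenset("bc"), frozenset("b"), frozenset("c")}

assert G(incomplete, X5) == corollary_rhs(incomplete, X5)      # eq. (58)
assert G(complete, X3) == corollary_rhs(complete, X3)          # eq. (58)
assert G(complete, X3) == (2 - 1) * len(X3) + G(complete, frozenset("bc"))  # eq. (59)
```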



13 The tree function $E_{B(X)}$

The next theorem will be the first main statement about the properties of amount functions, in that it expresses the amount of a tree as a function of the pairs of numbers $(n(b), m(b))$ at every node $b \in B(X)$. To formulate this we need to define a new quantity:

Definition 13.1 (Tree function). Let $B(X)$ be a tree structure over $X$. The tree function $E_{B(X)}$ of the tree $B(X)$ is defined to be

$$E_{B(X)} = \sum_{b \in B(X)} n(b) \cdot [\, m(b) - 1 \,] , \tag{65}$$

where the sum runs over all nodes in the tree.

Theorem 13.2 (Tree function and total amount). Let $B(X)$ be a tree structure over $X$.

(A) For a general tree,

$$E_{B(X)} = \sum_{b \in z_{\max}(X)} n(b) \cdot e_{B(X)}(b) . \tag{66}$$

(B) If $B(X)$ is complete then

$$E_{B(X)} = G_{B(X)} . \tag{67}$$

These results say that, for a complete tree, the tree function coincides with the total amount in the tree, whereas if the tree is incomplete, then the tree function renders a weighted sum of the path amounts $e_{B(X)}(b)$, the weights being equal to the cardinality $n(b)$ of the terminal elements $b \in z_{\max}(X)$ in the incomplete tree.

Proof: 

We first prove (A) by induction with respect to $n = \#X$: For $n = 1$, both left-hand side (LHS) and right-hand side (RHS) are zero and hence agree.

For $n = 2$, there are only two possible trees: Either $B(X) = \{X\}$ is the trivial tree, in which case $z_{\max}(X) = \{X\}$, $e_{B(X)}(X) = 0$, and $m(X) = 1$, so that again, LHS and RHS agree to give zero. Or $B(X) = \{X, \{x_1\}, \{x_2\}\}$. In this case, the LHS is equal to

$$E_{B(X)} = 2 \cdot (2 - 1) + 1 \cdot (1 - 1) + 1 \cdot (1 - 1) = 2 . \tag{68}$$

On the RHS, $z_{\max}(X) = \{\{x_1\}, \{x_2\}\}$, $e_{B(X)}(b) = 1$ and $n(b) = 1$ for $b \in z_{\max}(X)$, so that the RHS also yields 2.

We now perform the induction: We assume that eq. (66) holds for all possible sets $X$ with $\#X \in \{1, \ldots, n-1\}$. We shall prove that eq. (66) is valid for sets $X$ with $\#X = n$ as well. If $B(X) = \{X\}$ is trivial then eq. (66) holds trivially as before. Thus we can assume that the tree is nontrivial, which implies that the set $X$ is properly split in the tree, hence $\# z_{\min}(X) > 1$. As a consequence, the cardinality of each $a \in z_{\min}(X)$ must be smaller than that of $X$,

$$\# a < \# X = n \quad \text{for all } a \in z_{\min}(X) . \tag{69}$$

Now we decompose $B(X)$ into a sum of subtrees $B(a,X)$, where $a \in z_{\min}(X)$, as in eq. (12). Then the LHS of eq. (66) can be split into

$$E_{B(X)} = n(X) \cdot [\, m(X) - 1 \,] + \sum_{a \in z_{\min}(X)} \; \sum_{b \in B(a,X)} n(b) \cdot [\, m(b) - 1 \,] . \tag{70}$$

For each of the sums $\sum_{b \in B(a,X)}$ on the RHS of (70), the induction assumption applies,

$$\sum_{b \in B(a,X)} n(b) \cdot [\, m(b) - 1 \,] = E_{B(a,X)} = \sum_{b \in z_{\max}(a)} n(b) \cdot e_{B(a,X)}(b) , \tag{71}$$

where the maximal partition $z_{\max}(a)$ of $a$ refers to the subtree $B(a,X)$, but is clearly the same as with respect to the full tree $B(X)$. Now assume that $b \in z_{\max}(a)$ for some $a \in z_{\min}(X)$, and consider the path amount of $b$ in the full path $q(b,B)$,

$$
\begin{aligned}
e_{B(X)}(b) &= \sum_{b' \in q(b,B)} [\, m(b') - 1 \,] = \sum_{b' \in q(b,B(a,X))} [\, m(b') - 1 \,] + m(X) - 1 \\
&= e_{B(a,X)}(b) + m(X) - 1 .
\end{aligned} \tag{72}
$$

As a consequence,

$$\sum_{b \in z_{\max}(a)} n(b) \cdot e_{B(a,X)}(b) = \sum_{b \in z_{\max}(a)} n(b) \cdot e_{B(X)}(b) - [\, m(X) - 1 \,] \cdot \# a . \tag{73}$$

If eqs. (71, 73) are inserted into the sum on the RHS of eq. (70) we obtain

$$
\begin{aligned}
\sum_{a \in z_{\min}(X)} \; \sum_{b \in B(a,X)} n(b) \cdot [\, m(b) - 1 \,] &= - [\, m(X) - 1 \,] \cdot \sum_{a \in z_{\min}(X)} \# a \\
&\quad + \sum_{a \in z_{\min}(X)} \; \sum_{b \in z_{\max}(a)} n(b) \cdot e_{B(X)}(b) .
\end{aligned} \tag{74}
$$

However,

$$\sum_{a \in z_{\min}(X)} \# a = \# X = n(X) , \tag{75}$$

and

$$\sum_{a \in z_{\min}(X)} \; \sum_{b \in z_{\max}(a)} n(b) \cdot e_{B(X)}(b) = \sum_{b \in z_{\max}(X)} n(b) \cdot e_{B(X)}(b) . \tag{76}$$

If eqs. (75, 76) are inserted into eq. (74) we obtain a contribution $-[\, m(X) - 1 \,] \, n(X)$ which cancels the same term in eq. (70), so that $E_{B(X)}$ on the LHS of eq. (70) is equal to (76), which is what we have claimed in eq. (66). This finishes the proof of (A).

Now we prove (B). If $B(X)$ is complete, then $n(b) = 1$ for all $b \in z_{\max}(X)$. As a consequence, eq. (66) becomes

$$E_{B(X)} = \sum_{b \in z_{\max}(X)} e_{B(X)}(b) , \tag{77}$$

but, according to eq. (39), the sum on the RHS of (77) is just $G_{B(X)}$ by definition of $G$. This proves (B). ■
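
Both statements of Theorem 13.2 can be verified on small examples. The sketch below is an illustration only; the set-of-frozensets representation and the names `tree_function`, `amount`, `weighted_amounts` and `total_amount` are assumptions made for this example.

```python
def kids(tree, b):
    """Children of b, i.e. the elements of z_min(b); empty if b is terminal."""
    below = [c for c in tree if c < b]
    return [c for c in below if not any(c < d for d in below)]

def tree_function(tree):
    """E of eq. (65): sum of n(b) * [m(b) - 1] over all nodes."""
    return sum(len(b) * ((len(kids(tree, b)) or 1) - 1) for b in tree)

def amount(tree, b):
    """e(b) of eq. (36): sum of m(a) - 1 over all proper ancestors a of b."""
    return sum(len(kids(tree, a)) - 1 for a in tree if b < a)

def weighted_amounts(tree):
    """Right-hand side of eq. (66): terminal nodes weighted by their size."""
    return sum(len(b) * amount(tree, b) for b in tree if not kids(tree, b))

def total_amount(tree):
    """G of eq. (39): unweighted sum of the amounts of the terminal nodes."""
    return sum(amount(tree, b) for b in tree if not kids(tree, b))

# Hypothetical examples: the node {b,c} is terminal in the first tree only.
incomplete = {frozenset("abcde"), frozenset("abc"), frozenset("de"),
              frozenset("a"), frozenset("bc"), frozenset("d"), frozenset("e")}
complete = incomplete | {frozenset("b"), frozenset("c")}

assert tree_function(incomplete) == weighted_amounts(incomplete)   # statement (A)
assert tree_function(complete) == weighted_amounts(complete)       # statement (A)
assert tree_function(complete) == total_amount(complete)           # statement (B)
assert tree_function(incomplete) != total_amount(incomplete)       # (B) needs completeness
```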



14 Minimal classes 

We now come to discuss the problem of minimizing the tree function $E_{B(X)}$ on certain sets of tree structures. We will need a couple of new notions, which we introduce in what follows:

Consider the set $\mathcal{M}(X)$ of all tree structures $B(X)$ over $X$. To every $B \in \mathcal{M}(X)$ we can uniquely assign the minimal partition $z_{\min}(X)$ induced by $B$ on $X$; this assignment will be denoted by $z_{\min} : B \mapsto z_{\min}(X,B)$. Given $z \in Z(X)$, the inverse image $z_{\min}^{-1}(z)$ is the set of all tree structures $B$ over $X$ with the same minimal partition $z_{\min}(X) = z$ of $X$.

Let $n = \#X$. For $1 \le m \le n$, let $\mathcal{M}(X,m)$ denote the set of all tree structures over $X$ whose minimal partition $z_{\min}(X)$ contains $m$ elements. Since all $\mathcal{M}(X,m)$ are disjoint, this defines a partition of $\mathcal{M}(X)$,

$$\mathcal{M}(X) = \bigcup_{1 \le m \le n} \mathcal{M}(X,m) . \tag{78}$$

We recall that the tree function $E : \mathcal{M}(X) \to \mathbb{N}$ sends every tree over $X$ to the sum over all $n(b) \, [\, m(b) - 1 \,]$, as $b$ ranges through all nodes in the tree. We are interested in the minima of this map, as $E$ is restricted to certain subsets of $\mathcal{M}(X)$. We observe that it makes no sense to ask for the global minimum of $E$ on $\mathcal{M}(X)$, as the answer is trivial: In this case the minimum clearly is taken on the trivial tree $B = \{X\}$, since $E_B = G_B = 0$. Meaningful results are obtained, however, if we first focus on the subset of all complete trees $\mathcal{C}(X) \subset \mathcal{M}(X)$; this inclusion is proper for $\#X \ge 2$. We write $\mathcal{C}(X,m)$ for the set of all complete trees with $m$ elements in the minimal partition of $X$. On the complete trees, the tree function $E$ coincides with the total amount $G$, as follows from statement (B) in theorem 13.2. Now we define



$$\min(X) = \min_{B \in \mathcal{C}(X)} E_B , \tag{79}$$

and

$$\min(X,m) = \min_{B \in \mathcal{C}(X,m)} E_B . \tag{80}$$

In fact, $\min(X)$ is a function of $n = \#X$ only, and $\min(X,m)$ is a function of $n$ and $m$ only,

$$\min(n) \equiv \min(X) , \quad \min(n,m) \equiv \min(X,m) . \tag{81}$$

These minima exist, since all tree functions take their values in the non-negative integers. Thus it makes sense to speak of the set of all complete trees

$$\mathrm{Min}(X) = E^{-1}(\min(X)) \cap \mathcal{C}(X) \tag{82}$$

on which the tree function $E$ actually takes its minimum. Similarly, we introduce

$$\mathrm{Min}(X,m) = E^{-1}(\min(X,m)) \cap \mathcal{C}(X,m) . \tag{83}$$

We term $\mathrm{Min}(X)$ the global minimal class in $\mathcal{C}(X)$. $\mathrm{Min}(X,m)$ will be called the minimal class in $\mathcal{C}(X,m)$.
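
For small $n$ the minima $\min(n)$ and $\min(n,m)$ can be found by exhaustive search. The sketch below is not taken from the text: it assumes that, by the additive split of eq. (70), a complete tree whose first level has block sizes $(n_1, \ldots, n_m)$ has tree function $n\,(m-1)$ plus the tree functions of its subtrees, so that each subtree can be minimized independently. The names `divisions`, `min_E` and `min_E_m` are hypothetical, and `min_E_m` is meant for $2 \le m \le n$.

```python
from functools import lru_cache

def divisions(n, m, smallest=1):
    """Yield all ways to write n as a non-decreasing sum of m positive integers."""
    if m == 1:
        if n >= smallest:
            yield (n,)
        return
    for first in range(smallest, n // m + 1):
        for rest in divisions(n - first, m - 1, first):
            yield (first,) + rest

@lru_cache(maxsize=None)
def min_E(n):
    """min(n): least value of the tree function over complete trees on n elements."""
    if n == 1:
        return 0                      # a single element: the trivial tree is complete
    return min(min_E_m(n, m) for m in range(2, n + 1))

@lru_cache(maxsize=None)
def min_E_m(n, m):
    """min(n, m): as above, with m blocks in the minimal partition (2 <= m <= n)."""
    return min(n * (m - 1) + sum(min_E(k) for k in u) for u in divisions(n, m))

table = {n: min_E(n) for n in range(1, 9)}
```

Under these assumptions the search returns, for instance, `min_E(4) == 8` and `min_E(8) == 24`, both attained with $m = 2$ blocks at the first level.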



15 Divisions 

Given a natural number $n$, we can decompose $n$ into $m$ terms according to $n = n_1 + \cdots + n_m$ with $1 \le m \le n$ in many different ways, and for values of $m$ ranging from 1 to $n$. For a given $m$, the numbers $n_i$ can range between 1 and $n$, and the $n_i$ need not be mutually different. A decomposition of $n$ in this form will be called a division of $n$ into $m$ terms. We can regard it as an $m$-tuple $u = (n_1, \ldots, n_m)$ with positive integer components, $n_i > 0$, such that $\sum_{i=1}^{m} n_i = n$. The set of all divisions of $n$ into $m$ terms will be denoted by $U(n,m)$. If $n$ is fixed and $m$ varies from 1 to $n$, the collection of all $U(n,m)$ defines a partition of the set of all divisions $U(n)$ of $n$,

$$U(n) = \bigcup_{1 \le m \le n} U(n,m) . \tag{84}$$

We introduce the trivial division $u_0 = (n)$, and denote the set of all nontrivial divisions of $n$ by $U^*(n) = U(n) - \{u_0\}$.

$U(n,m)$ is a proper subset of

$$H(n,m) = \Big\{\, h \in \mathbb{R}^m \;\Big|\; \sum_{i=1}^{m} h_i = n \,\Big\} \subset \mathbb{R}^m , \tag{85}$$

which is a hyperplane in $\mathbb{R}^m$ whose least Euclidean distance to the origin is $n/\sqrt{m}$. The element of $H(n,m)$ associated with the least distance will be denoted by $h$; it has components $h = (\frac{n}{m}, \ldots, \frac{n}{m})$. Usually, $n/m$ is not an integer, so that $h \notin U(n,m)$. However, there are always elements $\hat n$ of $U(n,m)$ that come closest to $h$. The minimal distance between these elements $\hat n$ and $h$ ranges between $0$ and $\sqrt{m}/2$. If $h$ coincides with a point in $U(n,m)$, then $\hat n = h$ is uniquely defined. The bigger the distance between $h$ and lattice points, the more elements $\hat n$ there are: If $h$ lies at the center of a cube formed by lattice points, then the $2^m$ corners of this cube all have distance precisely $\sqrt{m}/2$ from $h$, and the candidates for $\hat n$ are those corners that lie in $U(n,m)$. In this case each of the components $h_i$ lies exactly between two integer values, $\frac{n}{m} \pm \frac{1}{2} \in \mathbb{Z}$; thus, $m$ must be even in this case. Whenever there is more than one $\hat n$, i.e. more than one element of $U(n,m)$ with the same minimal distance to $h$, they must be related by permutation of components.

There is another way to describe a division $n = n_1 + \cdots + n_m$; this is in terms of occupation numbers $t_k$ for all natural numbers $k$ between 1 and $n$ (and, in turn, even beyond), which express how often $k$ appears as one of the terms $n_i$ in a given decomposition of $n$. Obviously, the description of a division of $n$ into $m$ terms is determined by the set of occupation numbers $(t_1, t_2, \ldots)$ uniquely up to permutation of the terms $n_i$ in the sum. The detailed definition is as follows:

Let $n \in \mathbb{N}$, let $1 \le m \le n$. The tuple $t = (t_1, t_2, \ldots) \in \mathbb{N}_0 \times \mathbb{N}_0 \times \cdots$ will be called occupation numbers of the division of $n$ into $m$ terms, if it satisfies

$$\sum_{k=1}^{n} t_k = m , \tag{86a}$$

$$\sum_{k=1}^{n} k \cdot t_k = n . \tag{86b}$$

The first sum says that the number of terms in the division of $n$ is $m$; the second sum is just the decomposition of $n$. Clearly, for $k > n$ all occupation numbers $t_k$ must vanish. For this reason we will now focus on the finite sequences $t = (t_1, t_2, \ldots, t_n)$ of occupation numbers rather than the infinite ones, so that $t$ ranges in $\mathbb{N}_0^n$.

The trivial division as expressed by occupation numbers is $t_0 = (0, \ldots, 0, 1)$, i.e., $t_n = 1$, and all other components vanishing. The set of all occupation numbers of divisions of $n$ into $m$ terms will be denoted by $T(n,m)$; the set of all occupation numbers of divisions of $n$ will be written as $T(n)$. The occupation numbers of nontrivial divisions comprise the set $T^*(n)$. Clearly, $t_n = 0$ for every nontrivial $t \in T^*(n)$.

The relation between divisions $u$ and their associated occupation numbers $t$ is as follows: Every division $u = (n_1, \ldots, n_m)$ defines a unique $n$-tuple of occupation numbers $\kappa(u) = (t_1, \ldots, t_n)$ by

$$\kappa(u)_a = t_a = \sum_{i=1}^{m} \delta_{a, n_i} . \tag{87}$$

It follows readily that this indeed satisfies (86). Furthermore, every $n$-tuple $t$ of occupation numbers defines a unique naturally ordered division $u$ of $n$ by $m$ terms, $u_1 \le u_2 \le \cdots \le u_m$. Now the inverse image $\kappa^{-1}(t)$ of an occupation number tuple $t$ is just the set of all divisions $u'$ that are related to the naturally-ordered division $u$ by permutation of components. Thus, every such inverse image has a naturally-ordered representative. We conclude that there is a 1-1 relation between naturally-ordered divisions of $n$ and occupation numbers.
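
The map $\kappa$ of eq. (87) and its naturally ordered inverse can be sketched as follows. The function names `occupation_numbers` and `natural_order` are illustrative, not notation from the text.

```python
from collections import Counter

def occupation_numbers(u):
    """kappa(u) of eq. (87): t_a counts how often a occurs among the terms of u."""
    n = sum(u)
    counts = Counter(u)
    return tuple(counts.get(a, 0) for a in range(1, n + 1))

def natural_order(t):
    """The naturally ordered division belonging to the occupation numbers t."""
    return tuple(a for a, t_a in enumerate(t, start=1) for _ in range(t_a))

u = (3, 1, 3, 2)                      # a division of n = 9 into m = 4 terms
t = occupation_numbers(u)

assert sum(t) == len(u)                                        # eq. (86a)
assert sum(k * t_k for k, t_k in enumerate(t, start=1)) == sum(u)   # eq. (86b)
assert natural_order(t) == tuple(sorted(u))
```

Any permutation of `u` gives the same `t`, which is the permutation freedom described above.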



16 Partitions and divisions 

Let $n = \#X$, let $z$ be an arbitrary partition of $X$, not necessarily related to a tree structure over $X$. Assume that the partition $z$ contains $m$ elements, $m = \#z$, where $z = \{b_1, \ldots, b_m\}$. $z$ defines a division $u(z)$ of $n$ into $m$ terms by $u = (\#b_1, \ldots, \#b_m)$. This defines the $u$-map $u : Z(X) \to U(n)$, $z \mapsto u(z)$. The associated occupation number will be written as $t(z)$ and has components

$$t(z)_a = \sum_{b \in z} \delta_{a, \#b} \tag{88}$$

for $a = 1, \ldots, n$. $t(z)_a$ will be called the $a$-th occupation number of the partition $z$. This defines the $t$-map $t : Z(X) \to T(n)$, $z \mapsto t(z)$; it sends every partition of $X$ to the associated $n$-tuple of occupation numbers. The $u$- and $t$-maps are obviously surjective, since for every division of $n$ into $m$ terms one can construct an associated partition of $X$.

From the surjectivity of $u$ and $t$ and the fact that the map $z_{\min}$ sends $\mathcal{M}(X)$ onto the set of all partitions $Z(X)$ we find $U(n) = (u \circ z_{\min})(\mathcal{M}(X))$ and $T(n) = (t \circ z_{\min})(\mathcal{M}(X))$, and furthermore, $U(n,m) = (u \circ z_{\min})(\mathcal{M}(X,m))$ and $T(n,m) = (t \circ z_{\min})(\mathcal{M}(X,m))$.

The distinct occupation number $t_{\min}(X) = (t \circ z_{\min})(B)$ will be called the minimal division of $n = \#X$ in $B(X)$.

Definition 16.1 (Integer quotient). For $n \in \mathbb{N}_0$, $m \in \mathbb{N}$, let

$$\left[ \frac{n}{m} \right] = \max \{\, n' \in \mathbb{N}_0 \mid n' \cdot m \le n \,\} \tag{89}$$

denote the integer quotient of $n$ by $m$.



17 Optimal division 

Let $1 \le m \le n$. Let $\nu = \left[ \frac{n}{m} \right]$ be the integer quotient of $n$ by $m$; then $n = \nu \cdot m + r$ with $0 \le r < m$. We construct a division of $n$ into $m$ terms according to

$$(\nu, \ldots, \nu, \nu + 1, \ldots, \nu + 1) , \tag{90}$$

with $(m - r)$ occurrences of $\nu$ and $r$ occurrences of $(\nu + 1)$. The associated occupation number is denoted as $\hat t = \hat t(n,m) = (\hat t_1, \ldots, \hat t_n)$, with $\hat t_\nu = m - r$, $\hat t_{\nu+1} = r$, and $\hat t_\lambda = 0$ for $\lambda \notin \{\nu, \nu + 1\}$. Consider the inverse image $\kappa^{-1}(\hat t)$ of $\hat t$ under $\kappa$; every representative of this set will be called optimal division of $n$ by $m$, and will be denoted by $\hat n$. Obviously, the optimal divisions come closest to the $m$-tuple $h = (\frac{n}{m}, \ldots, \frac{n}{m}) \in H(n,m) \subset \mathbb{R}^m$, where $h$ is the element in $H(n,m)$ with least Euclidean distance to the origin; thus, they coincide with the objects $\hat n$ introduced in section 15. We observe that $\kappa^{-1}(\hat t)$ is the set of all elements $\hat n$ of $U(n,m)$ for which the Euclidean norm satisfies

$$\| \hat n - h \| \le \frac{\sqrt{m}}{2} . \tag{91}$$



We now prove an important lemma about optimal divisions: 



Lemma 17.1 (Optimal divisions). Let $\|u\| = \sqrt{\sum_{i=1}^{m} n_i^2}$ denote the Euclidean norm of an element $u \in \mathbb{R}^m$. Let $u = (n_1, \ldots, n_m)$ be an element of $U(n,m)$. Then there exists a finite sequence $u^0, u^1, \ldots, u^k$ of elements in $U(n,m)$ with $u^0 = u$, $u^k = \hat n$ for some $\hat n \in \kappa^{-1}(\hat t)$ such that

$$\|u^0\| > \|u^1\| > \cdots > \|u^k\| , \tag{92}$$

and the step $u^a \to u^{a+1}$ involves alteration of two components of $u^a$ only.
Proof: 

Denote $M = \{1, \ldots, m\}$ for short. For $(i,j) \in M^2$, $i \neq j$, we define an operation $S_{ij}$ on elements $h \in \mathbb{R}^m$ by

$$S_{ij}(h_1, \ldots, h_m) = (h_1, \ldots, h_i + 1, \ldots, h_j - 1, \ldots, h_m) , \tag{93}$$

i.e., all components except $h_i$ and $h_j$ remain the same. By construction, $S_{ij}$ preserves $H(n,m)$, for if $h \in H(n,m)$ then so is $S_{ij} h$.

We first prove the statement: Let $u \in U(n,m)$, let $\Delta = u - h$. If $\Delta_i - \Delta_j \le 1$ for all $(i,j) \in M^2$, then

$$u = \hat n \in \kappa^{-1}(\hat t) . \tag{94}$$

Proof of (94): If $\Delta = 0$ then the statement is trivial; hence assume $\Delta \neq 0$. Let $\Delta_{\max}$ denote the maximal element in $\{\Delta_1, \ldots, \Delta_m\}$. $\Delta_{\max}$ is certainly $> 0$; for, $\sum_i \Delta_i = \sum_i u_i - \sum_i h_i = 0$, and there must be nonzero components $\Delta_i$. Our starting assumption says $\Delta_{\max} - \Delta_i \le 1$, but on the other hand, $\Delta_i \le \Delta_{\max}$, hence

$$0 \le \Delta_{\max} - \Delta_i \le 1 \quad \text{for all } i . \tag{95}$$

However, $\Delta_i - \Delta_j = u_i - u_j \in \mathbb{Z}$, hence the same must be true for the quantities $\Delta_{\max} - \Delta_i$. We conclude that $\Delta_{\max} - \Delta_i \in \{0, 1\}$. We have altogether $m$ components $\Delta_i$, which can take values of either $\Delta_{\max}$ or $\Delta_{\max} - 1$. Suppose there are $(m - r)$ components $\Delta_i = \Delta_{\max}$, and $r$ components of the form $\Delta_{\max} - 1$, where $0 \le r \le m$. We cannot have $r = m$, for otherwise none of the $\Delta_i$ would take the maximal value $\Delta_{\max}$; thus, $r < m$. The sum over all $\Delta_i$ must vanish, from which it follows that $m \, \Delta_{\max} = r$. An easy computation now gives $\|\Delta\|^2 = \frac{r (m - r)}{m}$. If $m$ is fixed, the expression on the RHS is zero for $r = 0$ and becomes maximal for $r = \frac{m}{2}$, in which case it takes the value $\frac{m}{4}$. Hence $\|u - h\| \le \frac{\sqrt{m}}{2}$, which implies that $u = \hat n$ by eq. (91). This proves the statement (94).

Now we prove our lemma: We describe step 1 in constructing the series (92): Let $\Delta^0 = u^0 - h$. If $u^0 = \hat n$, there is nothing to prove. If $u^0 \neq \hat n$, we conclude from statement (94) that there exists a pair $(i,j) \in M^2$ with $i \neq j$ such that $\Delta^0_j - \Delta^0_i > 1$; since the left-hand side must be an integer we must have, in fact, that $\Delta^0_j - \Delta^0_i \ge 2$. Now we define the new element $u^1 = S_{ij} u^0$ for this choice of $(i,j)$. Let $\Delta^1 = u^1 - h = S_{ij} \Delta^0$. Since $h$ has least distance to the origin, it is perpendicular to the hyperplane $H(n,m)$, whereas $\Delta^0$, $\Delta^1$ are parallel to this plane. Hence, by Pythagoras,

$$
\begin{aligned}
\|u^0\|^2 &= \|h\|^2 + \|\Delta^0\|^2 , \\
\|u^1\|^2 &= \|h\|^2 + \|\Delta^1\|^2 ,
\end{aligned} \tag{96}
$$

or $\|u^1\|^2 - \|u^0\|^2 = \|S_{ij} \Delta^0\|^2 - \|\Delta^0\|^2$. The last expression is just $2 (\Delta^0_i - \Delta^0_j + 1)$, which must be $\le -2$ owing to $\Delta^0_j - \Delta^0_i \ge 2$. Thus,

$$\|u^1\|^2 \le \|u^0\|^2 - 2 < \|u^0\|^2 , \tag{97}$$

and only two components of $u^0$, namely $u^0_i$ and $u^0_j$, have been altered. This finishes step 1. In step 2 we check whether $u^1 = \hat n$ for some $\hat n$; if yes, the process terminates; if no, it continues in the same manner. Since every step $a$ involves a decrease of $\|\Delta^a\|^2$ by at least 2, the process must terminate after a finite number of steps. ■



18 Optimal trees 



A tree structure $B_o = B_o(X)$ over the set $X$ is called optimal over $X$, if $B_o$ is complete, and

$$t(z_{\min}(b)) = \hat t(n(b), 2) \tag{98}$$

for all non-terminal elements $b \in B_o$. This means that every node $b$ not belonging to the maximal partition $z_{\max}(X)$ of $X$ is partitioned into two halves which are as close to being equal as possible, when stepping to the next level in the tree; and every terminal node contains only one element. The set of all optimal trees over $X$ forms (for $\#X > 2$) a proper subset of $\mathcal{C}(X)$, which will be denoted by $\mathcal{O}(X)$.
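
An optimal tree in this sense is easy to build recursively. The sketch below is an illustration under the assumption that a tree is stored as a set of frozensets and that the elements passed in are distinct; `optimal_tree` and `tree_function` are hypothetical names.

```python
def optimal_tree(elements):
    """Complete tree in which every non-terminal node is split into two halves
    whose sizes differ by at most one."""
    elements = tuple(elements)
    tree = {frozenset(elements)}
    if len(elements) > 1:
        half = len(elements) // 2
        tree |= optimal_tree(elements[:half]) | optimal_tree(elements[half:])
    return tree

def tree_function(tree):
    """E of eq. (65), with m(b) the number of children of b (1 if b is terminal)."""
    total = 0
    for b in tree:
        below = [c for c in tree if c < b]
        kids = [c for c in below if not any(c < d for d in below)]
        total += len(b) * ((len(kids) or 1) - 1)
    return total

E5 = tree_function(optimal_tree(range(5)))
E8 = tree_function(optimal_tree(range(8)))
```

For eight elements every level contributes $8$ and there are three levels of splitting, so `E8` equals $8 \cdot 3 = 24$.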



19 Minimal classes in T(n, m) 

Every minimal tree $B \in \mathrm{Min}(X)$ maps into a certain partition $z$ under $z_{\min}$, and into a certain occupation number $t$ under the $t$-map. We shall be interested in the image of $\mathrm{Min}(X)$ under this sequence of maps, which we will denote as

$$T_{\min}(n) = (t \circ z_{\min})(\mathrm{Min}(X)) , \tag{99}$$

and which we shall call the global minimal class in $T(n)$. For $1 \le m \le n$, the set

$$T_{\min}(n,m) = (t \circ z_{\min})(\mathrm{Min}(X,m)) \tag{100}$$

will be termed the minimal class in $T(n,m)$.

We note that we now have several distinct classes of occupation numbers in $T(n)$: We have the class containing all optimal divisions of $n$ by $m$, $\{\, \hat t(n,1), \hat t(n,2), \ldots, \hat t(n,n) \,\}$; and on the other hand, the classes $T_{\min}(n,m)$. The relation between these will be investigated in the following developments.

Now let $t \in T(n)$ be arbitrary. We can study its inverse image $(t \circ z_{\min})^{-1}(t) \cap \mathcal{C}(X)$ in $\mathcal{C}(X)$. To every tree in this set we can assign the associated tree function $E$; thus it makes sense to ask on which trees $B \in (t \circ z_{\min})^{-1}(t) \cap \mathcal{C}(X)$, for a given division $t$, the tree function $E$ assumes its minimum. This minimum will be denoted by $\min(t)$; hence

$$\min(t) = \min_{B \in (t \circ z_{\min})^{-1}(t) \cap \mathcal{C}(X)} E_B . \tag{101}$$

The associated subset of trees in $(t \circ z_{\min})^{-1}(t) \cap \mathcal{C}(X)$ that actually take this minimum will be denoted as $\mathrm{Min}(t)$,

$$\mathrm{Min}(t) = E^{-1}(\min(t)) \cap (t \circ z_{\min})^{-1}(t) \cap \mathcal{C}(X) . \tag{102}$$



20 Bases and integer logarithm 

Let $L \in \mathbb{N}$ with $L \ge 2$. Then the set

$$\mathbb{B}_L = \{\, L^k \mid k \in \mathbb{N} \,\} \tag{103}$$

will be called basis over $L$. The set $\mathbb{B}_2$ we shall also call binary basis. If no confusion is likely, $\mathbb{B}_2$ will be simply denoted by $\mathbb{B}$.

Definition 20.1 (Integer logarithm). Let $n \in \mathbb{N}$ be a natural number. The integer logarithm $\lg_L(n)$ of $n$ with respect to $L$ is defined as

$$\lg_L(n) = \max \{\, k \in \mathbb{N} \mid L^k \le n \,\} . \tag{104}$$

If no confusion is likely, the integer logarithm of $n$ with respect to 2 will simply be written $\lg(n) \equiv \lg_2(n)$. Clearly, $\lg_L$ is a monotonically increasing function on $\mathbb{N}$.

Proposition 20.2 (Properties of integer logarithm). The integer logarithm satisfies the following inequalities:

1. Let $n, n' \in \mathbb{N}$ with $n' \ge n$. Then

$$ \lg_L(n') \ge \lg_L(n) . \tag{105a} $$

2. Let $n, n' \in \mathbb{N}$. Then

$$ \lg_L(n) + \lg_L(n') \le \lg_L(n \cdot n') . \tag{105b} $$

3. Let $p \in \mathbb{N}_0$. Then

$$ p \cdot \lg_L(n) \le \lg_L(n^p) . \tag{105c} $$

Proof: 

$\lg_L(n)$ is the maximum in the set of those integers $k$ which satisfy $L^k \le n$. As a consequence, $L^{\lg_L(n)} \le n \le n'$.
Hence $\lg_L(n)$ lies in the set of those integers $k'$ which satisfy $L^{k'} \le n'$ and consequently must be less than or
equal to its maximum. This proves (105a).

Let $k = \lg_L(n)$ and $k' = \lg_L(n')$. Then $L^k \le n$ and $L^{k'} \le n'$, from which it follows that $L^{k+k'} \le n n'$, hence
$k + k' \le \lg_L(n n')$. The converse is not necessarily true. Furthermore, from $L^k \le n$ it follows that $L^{pk} \le n^p$,
hence $pk \le \lg_L(n^p)$. ■
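The definition (104) and the inequalities (105a)-(105c) can be spot-checked numerically. The following is a minimal Python sketch; the function name `lg`, the chosen bases and the tested ranges are illustrative assumptions and not part of the original text.

```python
def lg(n: int, base: int = 2) -> int:
    """Integer logarithm: the largest k >= 0 with base**k <= n (requires n >= 1)."""
    if n < 1 or base < 2:
        raise ValueError("need n >= 1 and base >= 2")
    k = 0
    power = base
    while power <= n:
        k += 1
        power *= base
    return k


# Spot-check the three inequalities of Proposition 20.2 on a small range.
for base in (2, 3, 10):
    for n in range(1, 60):
        for n2 in range(n, 60):
            assert lg(n2, base) >= lg(n, base)                     # (105a)
            assert lg(n, base) + lg(n2, base) <= lg(n * n2, base)  # (105b)
        for p in range(0, 5):
            assert p * lg(n, base) <= lg(n ** p, base)             # (105c)
```

The pair $n = n' = 3$ with $L = 2$ shows that (105b) can be strict: $\lg(3) + \lg(3) = 2$, whereas $\lg(9) = 3$.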

Lemma 20.3 (Standard decomposition). Every natural number $n \in \mathbb{N}_{>0}$ has a standard decomposition

$$ n = 2^{\lg(n)} + R , \tag{106a} $$

$$ 0 \le R < \frac{n}{2} , \qquad R < 2^{\lg(n)} . \tag{106b} $$

Proof:

We show that $R$ must indeed be limited to be $< n/2$. Suppose to the contrary that $2R \ge n$ for some $n$.
It follows that $2n = 2^{\lg(n)+1} + 2R \ge 2^{\lg(n)+1} + n$, or $n \ge 2^{\lg(n)+1}$, implying $\lg(n) \ge \lg(n) + 1$, which is a
contradiction. Hence the bound $R < n/2$ holds. Similarly, if $R \ge 2^{\lg(n)}$, then $n \ge 2^{\lg(n)+1}$, leading to a contradiction as
before; thus the bound $R < 2^{\lg(n)}$ is true as well. ■

Lemma 20.4. The equation

$$ \lg(2\nu + 1) = \lg(2\nu) = \lg(\nu) + 1 \tag{107} $$

holds for all integers $\nu \in \mathbb{N}$.
Proof: 

We use the decomposition (106a) for $2\nu$,

$$ 2\nu = 2^{\lg(2\nu)} + R , \qquad 0 \le R < 2^{\lg(2\nu)} , \quad R < \nu . \tag{108} $$

Assume that the first equation in (107) is not true, but rather

$$ \lg(2\nu + 1) = \lg(2\nu) + 1 . \tag{109} $$

We then have a chain of inequalities

$$ 2\nu + 1 \ge 2^{\lg(2\nu+1)} = 2 \cdot 2^{\lg(2\nu)} = 2 \cdot \big[ 2^{\lg(2\nu)} + R - R \big] = 2 \cdot [ 2\nu - R ] . \tag{110} $$

It follows that

$$ R \ge \nu - \tfrac{1}{2} , \tag{111} $$

but $R$, $\nu$ are integers, and therefore we must have

$$ R \ge \nu , \tag{112} $$

which contradicts the second inequality on the RHS of (108).

We now prove the second equation in (107): Let $k = \lg(\nu)$. Then $2^k \le \nu$, but $2^{k+1} > \nu$. By multiplying
these inequalities with a factor of 2 we find $2^{k+1} \le 2\nu$, but $2^{k+2} > 2\nu$. It follows that $k + 1$ has the maximum
property with respect to $2\nu$, as required in definition 20.1, and therefore $\lg(2\nu) = k + 1$, which proves the
second equation in (107). ■
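The standard decomposition of Lemma 20.3 and the identity (107) of Lemma 20.4 lend themselves to a quick numerical check. The sketch below is illustrative only: the helper name `standard_decomposition` is hypothetical, and `int.bit_length` is used as a shortcut for the binary integer logarithm.

```python
def lg(n: int) -> int:
    """Binary integer logarithm, i.e. the largest k with 2**k <= n (n >= 1)."""
    return n.bit_length() - 1


def standard_decomposition(n: int) -> tuple[int, int]:
    """Return (2**lg(n), R) such that n == 2**lg(n) + R."""
    top = 1 << lg(n)
    return top, n - top


for n in range(1, 1025):
    top, r = standard_decomposition(n)
    assert n == top + r                              # decomposition itself
    assert 0 <= r < top and 2 * r < n                # bounds on R in Lemma 20.3
    assert lg(2 * n + 1) == lg(2 * n) == lg(n) + 1   # eq. (107) with nu = n
```

For instance, $n = 13$ decomposes as $13 = 2^3 + 5$, with $R = 5 < 13/2$.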



21 Optimal amount 

Theorem 21.1 (Amount of optimal trees). Let $n = \#X$ and $B_o \in \mathcal{O}(X)$. Let $\lg(n)$ denote the integer
logarithm with respect to 2. Then

$$ E_{B_o} = G_{B_o} = n \cdot \lg(n) + 2 \cdot \big[ n - 2^{\lg(n)} \big] . \tag{113} $$

This value is the same for all $B_o \in \mathcal{O}(X)$ and depends only on $n$.

Definition 21.2. The common value of the amount of the optimal trees will be denoted by

$$ E(n) = G(n) = E_{B_o} = G_{B_o} . \tag{114} $$

Proof: 

By induction with respect to $n$. The statement is clear for $n = 1$, since in this case $E_{B_o} = G_{B_o} = 0$ and
$\lg(1) = 0$.

Now perform the induction: Assume that eq. (113) holds for all $1 \le n' < n$. Let $X$ be a set with $\#X = n$,
let $B_o \in \mathcal{O}(X)$. Then the minimal partition $z_{\min}(X) = \{b_1, b_2\}$ of $X$ has two elements $b_1$ and $b_2$ whose
cardinalities are as close to $n/2$ as possible. Use formula (59), together with the fact that $m(X) = 2$,

$$ G_{B_o} = n + \sum_{b \in z_{\min}(X)} G_{B(b,X)} . \tag{115} $$

We have $\#b \le n - 1$ for all $b \in z_{\min}(X)$, and hence

$$ G_{B(b,X)} = \#b \cdot \lg(\#b) + 2 \cdot \big[ \#b - 2^{\lg(\#b)} \big] \tag{116} $$

by assumption. We now must distinguish whether $n$ is even or odd:

Case 1: $n = 2\nu + 1$. We apply formula (115), using the fact that $\#b_1 = \nu + 1$ and $\#b_2 = \nu$. This gives

$$ G_{B_o} = 3n + (\nu + 1) \cdot \lg(\nu + 1) + \nu \cdot \lg(\nu) - 2^{\lg(\nu+1)+1} - 2^{\lg(\nu)+1} . \tag{117} $$

Two subcases must be considered: $\lg(\nu) = \lg(\nu + 1)$, or $\lg(\nu) + 1 = \lg(\nu + 1)$. Consider first the case
$\lg(\nu) = \lg(\nu + 1)$,

$$ G_{B_o} = 3n + n \cdot \lg(\nu) - 2^{\lg(\nu)+2} . \tag{118} $$

The first two terms give $2n + n \cdot \lg(n)$, upon using eq. (107) in lemma 20.4. Multiple applications of the same
equation then produce eq. (113). Now consider the subcase $\lg(\nu + 1) = \lg(\nu) + 1$: This is the case if and only if
$\nu = 2^K - 1$, where $K \in \mathbb{N}_{>0}$. Therefore, $K = \lg(\nu + 1)$. On using $\lg(n) = \lg(\nu) + 1$ we find

$$ G_{B_o} = 2n + n \lg(n) + \nu + 1 - 2^K - 2^{K+1} . \tag{119} $$

But $\nu + 1 - 2^K = 0$, and hence we arrive at (113) again.

Case 2: $n = 2\nu$. In this case, $\#b_1 = \#b_2 = \nu = n/2$, and formula (115) gives

$$ G_{B_o} = n + 2\nu \, [ \lg(\nu) + 1 ] + 2\nu - 2 \cdot 2^{\lg(\nu)+1} , \tag{120} $$

which gives again (113), on using $\lg(n) = \lg(\nu) + 1$. ■
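The induction above amounts to the recursion $G(n) = n + G(\lceil n/2 \rceil) + G(\lfloor n/2 \rfloor)$ with $G(1) = 0$, which can be compared against the closed form (113). A minimal sketch under that reading follows; the function names are our own and purely illustrative.

```python
from functools import lru_cache


def lg(n: int) -> int:
    """Binary integer logarithm for n >= 1."""
    return n.bit_length() - 1


def amount_closed_form(n: int) -> int:
    """Right-hand side of eq. (113)."""
    return n * lg(n) + 2 * (n - 2 ** lg(n))


@lru_cache(maxsize=None)
def amount_recursive(n: int) -> int:
    """Total amount of a tree built by halving as evenly as possible, cf. eq. (115)."""
    if n == 1:
        return 0
    small = n // 2
    large = n - small
    return n + amount_recursive(small) + amount_recursive(large)


# The recursion and the closed form agree on the tested range.
assert all(amount_recursive(n) == amount_closed_form(n) for n in range(1, 2000))
```

For example, both functions return 16 for $n = 6$ and 5 for $n = 3$.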



Lemma 21.3 (Monotonicity of optimal amount). The optimal amount is a monotonically increasing
function of $n$. In particular,

$$ G(n + 1) - G(n) = \lg(n) + 2 \quad \text{for all } n \in \mathbb{N} . \tag{121} $$

Proof:

The result follows directly from eq. (113) for both cases $\lg(n) = \lg(n + 1)$ and $\lg(n) + 1 = \lg(n + 1)$. ■
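The difference formula (121) can be tabulated directly from the closed form (113). The short sketch below is an illustration only; `G` and `lg` are our own helper names.

```python
def lg(n: int) -> int:
    """Binary integer logarithm for n >= 1."""
    return n.bit_length() - 1


def G(n: int) -> int:
    """Optimal amount, closed form of eq. (113)."""
    return n * lg(n) + 2 * (n - 2 ** lg(n))


# Successive differences G(n+1) - G(n) for n = 1, ..., 8.
differences = [G(n + 1) - G(n) for n in range(1, 9)]

# Eq. (121): each difference equals lg(n) + 2.
assert differences == [lg(n) + 2 for n in range(1, 9)]
```

The differences are non-decreasing in $n$, which is the property used in the proof of Theorem 21.4.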

Using this lemma we can prove: 

Theorem 21.4 (Total amount for unsymmetric divisions). Let $n \in \mathbb{N}_{>0}$. Consider two divisions of $n$
into two terms,

$$ n_1 + n_2 = n'_1 + n'_2 . \tag{122a} $$

If

$$ (n'_1)^2 + (n'_2)^2 > n_1^2 + n_2^2 , \tag{122b} $$

then

$$ G(n'_1) + G(n'_2) \ge G(n_1) + G(n_2) . \tag{122c} $$

Proof: 

Eq. (122a) implies that some $\Delta$ exists such that $n'_1 = n_1 + \Delta$ and $n'_2 = n_2 - \Delta$. Without loss of generality we
can assume that $\Delta \ge 0$. If $\Delta = 0$ then there is nothing to prove, hence we can assume $\Delta > 0$. Then (122c) is
equivalent to

$$ G(n_1 + \Delta) - G(n_1) \ge G(n_2) - G(n_2 - \Delta) . \tag{123} $$

The LHS and RHS can be written as sums over differences $G(n_1 + \Delta) - G(n_1 + \Delta - 1)$, etc. Using eq. (121)
we find that (123) is equivalent to

$$ \sum_{j=0}^{\Delta-1} \lg(n_1 + j) \ge \sum_{j=0}^{\Delta-1} \lg(n_2 - \Delta + j) . \tag{124} $$

Now a short computation shows that (122b) implies

$$ n_1 > n_2 - \Delta . \tag{125} $$

Now eq. (105a) in proposition 20.2 shows that $\lg(n_1 + j) \ge \lg(n_2 - \Delta + j)$ for each pair of terms in (124). Thus
(122c) follows. ■
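Theorem 21.4 can be probed by brute force over small divisions. The sketch below is a sanity check under our own naming (`theorem_21_4_holds` is a hypothetical helper), not a substitute for the proof.

```python
def lg(n: int) -> int:
    """Binary integer logarithm for n >= 1."""
    return n.bit_length() - 1


def G(n: int) -> int:
    """Optimal amount, closed form of eq. (113)."""
    return n * lg(n) + 2 * (n - 2 ** lg(n))


def theorem_21_4_holds(limit: int) -> bool:
    """Check (122a)-(122c) for all pairs of divisions with parts below `limit`."""
    for n1 in range(1, limit):
        for n2 in range(1, limit):
            total = n1 + n2                      # (122a): both divisions sum to `total`
            for m1 in range(1, total):
                m2 = total - m1
                more_uneven = m1 * m1 + m2 * m2 > n1 * n1 + n2 * n2  # (122b)
                if more_uneven and G(m1) + G(m2) < G(n1) + G(n2):    # would violate (122c)
                    return False
    return True
```

The inequality (122c) is not strict in general: the divisions $(2,4)$ and $(3,3)$ of 6 satisfy (122b), yet $G(2) + G(4) = G(3) + G(3) = 10$.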

This theorem is used in proving the important 

Theorem 21.5 (Amount of non-optimal divisions). Let $u = (u_1, \ldots, u_m) \in U(n,m)$, let $\hat n = (\hat n_1, \ldots, \hat n_m)$
$\in \kappa^{-1}(\hat t)$ be an optimal division of $n$ by $m$. Then

$$ \sum_{i=1}^{m} G(u_i) \ge \sum_{i=1}^{m} G(\hat n_i) . \tag{126} $$

Proof: 

Clearly, the RHS of (126) is independent of the representative $\hat n \in \kappa^{-1}(\hat t)$, as the representatives differ only by
permutations of components. According to lemma 17.1 there exists a finite sequence $u^0, u^1, \ldots, u^s$ of elements
in $U(n,m)$ with $u^0 = u$ and $u^s = \hat n$ for some $\hat n \in \kappa^{-1}(\hat t)$, such that $\|u^0\| > \|u^1\| > \cdots > \|u^s\|$, and the step
$u^a \to u^{a+1}$ involves alteration of two components of $u^a$ only, namely $u_i^{a+1} = u_i^a + 1$ and $u_j^{a+1} = u_j^a - 1$. As a
consequence,

$$ (u_i^a)^2 + (u_j^a)^2 > (u_i^{a+1})^2 + (u_j^{a+1})^2 , \tag{127} $$

whereas $u_k^a = u_k^{a+1}$ for all $k \notin \{i, j\}$. Thus, using theorem 21.4 with (127) implies that

$$ \sum_{i=1}^{m} G(u_i^a) \ge \sum_{i=1}^{m} G(u_i^{a+1}) . \tag{128} $$

This inequality holds for every step $a \to a + 1$, and hence (126) follows. ■
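Inequality (126) can be checked exhaustively for small $n$ by enumerating all divisions of $n$ into $m$ positive terms. The sketch below assumes that the optimal division of $n$ by $m$ is the one whose terms are as equal as possible, which is how it is used in section 23; the helper names `divisions` and `optimal_division` are hypothetical.

```python
def lg(n: int) -> int:
    """Binary integer logarithm for n >= 1."""
    return n.bit_length() - 1


def G(n: int) -> int:
    """Optimal amount, closed form of eq. (113)."""
    return n * lg(n) + 2 * (n - 2 ** lg(n))


def divisions(n, m, largest=None):
    """Yield all non-increasing m-tuples of positive integers summing to n."""
    if largest is None:
        largest = n
    if m == 0:
        if n == 0:
            yield ()
        return
    for first in range(min(n - (m - 1), largest), 0, -1):
        for rest in divisions(n - first, m - 1, first):
            yield (first,) + rest


def optimal_division(n, m):
    """Division of n into m terms that are as equal as possible (assumed form)."""
    q, r = divmod(n, m)
    return (q + 1,) * r + (q,) * (m - r)


def theorem_21_5_holds(n_max: int) -> bool:
    """Check eq. (126) for every division of every n <= n_max."""
    for n in range(1, n_max + 1):
        for m in range(1, n + 1):
            best = sum(G(part) for part in optimal_division(n, m))
            if any(sum(G(part) for part in u) < best for u in divisions(n, m)):
                return False
    return True
```

Only unordered divisions are enumerated, which suffices because the sums in (126) are invariant under permutations of the components.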



22 Preoptimized trees 



The concept of preoptimization is a necessary intermediate step in solving the problem of
finding the global minimal class $\mathrm{Min}(n)$. Let $n = \#X$.

Definition 22.1 (Preoptimized trees). A tree $B \in \mathcal{M}(X)$ is called preoptimized if every subtree $B(b,X)$
of $B$ based on elements $b \in z_{\min}(X)$ in the minimal partition of $X$ in $B$ is optimal.

Thus, the only "degrees of freedom" of varying a preoptimized tree are the different choices of minimal
partitions $z_{\min}(X)$, where these choices can be effectively described by the set of all divisions $U(n)$ of $n$ into
$m$ terms, for $m = 1, \ldots, n$. Every preoptimized tree is complete by definition. The subset of all preoptimized
trees over $X$ in $\mathcal{M}(X)$ will be denoted by $p\mathcal{O}(X) \subset \mathcal{C}(X)$. $p\mathcal{O}(X)$ is comprised of the disjoint subsets $p\mathcal{O}(X,m)$
of preoptimized trees with $m$ elements in the minimal partition $z_{\min}(X)$; hence we have a partition of $p\mathcal{O}(X)$
according to $p\mathcal{O}(X) = \bigcup_{1 \le m \le n} p\mathcal{O}(X,m)$. Furthermore, we define

$$ p\mathcal{O}^*(X) = \bigcup_{2 \le m \le n} p\mathcal{O}(X,m) = p\mathcal{O}(X) - \{\{X\}\} . \tag{129} $$

The structure of the subsets $\mathcal{O}(X)$, $p\mathcal{O}(X)$, $p\mathcal{O}^*(X)$, etc., is independent of the nature of the underlying
set $X$ but depends only on the number $n = \#X$ of elements in it. We can therefore write $\mathcal{O}(X) = \mathcal{O}(n)$,
$p\mathcal{O}(X) = p\mathcal{O}(n)$, $p\mathcal{O}^*(X) = p\mathcal{O}^*(n)$, etc., when appropriate.

On the subsets just described, the tree function $E$ coincides with the total amount by theorem 13.2, since
all trees are complete. There, $E = G$ takes the minima

$$ \mathrm{pmin}(X) = \min_{B \in p\mathcal{O}^*(X)} E_B , \tag{130a} $$

and

$$ \mathrm{pmin}(X,m) = \min_{B \in p\mathcal{O}(X,m)} E_B . \tag{130b} $$

In (130a) we have restricted the trees to the set $p\mathcal{O}^*(X)$, since for the trivial preoptimized partition $z_{\min}(X) =
\{X\}$ there is nothing to optimize, as there are no subtrees; and the total amount $G$ of this tree, as well as
the corresponding tree function $E$, is zero.

In accord with definitions (130) we introduce the sets of all preoptimized trees for which $G = E$ takes the
corresponding minima:

$$ \mathrm{Min}_p(X) = \mathrm{Min}_p(n) = E^{-1}\big( \mathrm{pmin}(X) \big) \cap p\mathcal{O}^*(X) , $$
$$ \mathrm{Min}_p(X,m) = \mathrm{Min}_p(n,m) = E^{-1}\big( \mathrm{pmin}(X,m) \big) \cap p\mathcal{O}(X,m) . $$

Obviously, $(t \circ z_{\min})(p\mathcal{O}^*(X)) = T^*(n)$, and $(t \circ z_{\min})(p\mathcal{O}(X,m)) = T(n,m)$. Hence $p\mathcal{O}(X,m)$ can be partitioned
according to

$$ p\mathcal{O}(X,m) = \bigcup_{t \in T(n,m)} \big[ (t \circ z_{\min})^{-1}(t) \cap p\mathcal{O}(X) \big] . \tag{132} $$



In all of the discussions so far the nature of the set $X$ was immaterial; the only thing that matters is the
number $n = \#X$ of elements in $X$. Thus, we could replace in each of the quantities above the symbol $X$ by $n$.



Now let $t \in T(n)$, $m = \sum_{k=1}^{n} t_k$ and $B \in (t \circ z_{\min})^{-1}(t) \cap p\mathcal{O}(X)$ such that $m = \#z_{\min}(X)$ for the
corresponding minimal partition of $X$. From eq. (59) in corollary 12.4 we have

$$ E_B = G_B = n(m - 1) + \sum_{b \in z_{\min}(X)} G_{B(b,X)} . \tag{133} $$

But all subtrees $B(b,X)$ are optimal, hence $G_{B(b,X)}$ coincides with $G(\#b)$ according to eq. (113) in theorem 21.1,
and the first term $n(m - 1)$ is constant for fixed $t$. Here, the common value of all optimal trees with $\#b$
elements in the underlying set is denoted as $G(\#b)$, according to definition 21.2. Thus $E_B$ is constant on
$(t \circ z_{\min})^{-1}(t) \cap p\mathcal{O}(X)$ and hence descends to a map, again denoted by

$$ E : T(n) \to \mathbb{N} , \qquad E(t) = E_B , \tag{134} $$

for any choice of representative $B \in (t \circ z_{\min})^{-1}(t) \cap p\mathcal{O}(X)$. Now (133) can be expressed as

$$ E(t) = n(m - 1) + \sum_{k=1}^{n} t_k \cdot G(k) , \tag{135} $$

for all $t \in T(n,m)$. Furthermore, we write $E(u) = E(t)$ for any division $u \in \kappa^{-1}(t)$.

In an earlier section we have introduced the minimum $\min(n,m)$ of the tree function on the subset of complete
trees $B \in \mathcal{C}(X)$ which have $m$ elements in their minimal partition $z_{\min}(X)$, where the base set is $X$ with $n = \#X$.
This subset corresponds to the set $T(n,m)$ defined earlier, and hence $\min(n,m)$ is the minimum of the
descended map $E$, formula (134), in $T(n,m)$. The relation of the quantity $\min(n,m)$ to the preoptimized
minimum $\mathrm{pmin}(n,m)$ introduced above is as follows:

Proposition 22.2 (Preoptimized minima). Let $n = \#X$ and $1 \le m \le n$, then

$$ \mathrm{pmin}(n,m) \ge \min(n,m) . \tag{136} $$

Proof:

The set of all preoptimized trees $p\mathcal{O}(n,m)$ in general is a proper subset of $\mathcal{M}(n,m) = (t \circ z_{\min})^{-1}(T(n,m))$.
Hence, the minimum $\mathrm{pmin}(n,m)$ of $E$ taken on $p\mathcal{O}(n,m)$ need not be the global minimum $\min(n,m)$ on
$T(n,m)$. ■

In the next section we shall compare the values $E(t)$ with $E(\hat t)$ at the optimal division $\hat t \in T(n,m)$.

23 Minimality of the optimal division 

In eq. (90) we have defined the optimal division $\hat t$ of $n$ by $m$ terms. In this section we show
that the preoptimized trees for which the minimal partition $z_{\min}(X)$ is optimal, or equivalently, for which
$t \circ z_{\min}(X) = \hat t$, are actually the minimal ones in $\mathcal{M}(n,m)$, i.e. they lie in $\mathrm{Min}(n,m)$. First we show that they
are the minimal ones in the set of all preoptimized trees $p\mathcal{O}(n,m)$:

Theorem 23.1 (Minimality of optimal partition 1). Let $\hat t = \hat t(n,m)$ be the occupation number of the
optimal division of $n$ by $m$. Then

$$ E(t) \ge E(\hat t) \tag{137} $$

for all $t \in p\mathcal{O}(n,m)$.

As a consequence we have

$$ \mathrm{pmin}(n,m) = E(\hat t) , \tag{138} $$

and the inverse image of $\hat t$ in $p\mathcal{O}(X)$ must therefore lie in $\mathrm{Min}_p(n,m)$,

$$ (t \circ z_{\min})^{-1}(\hat t) \cap p\mathcal{O}(X) \subset \mathrm{Min}_p(n,m) . \tag{139} $$



Proof: 

Let $t = (t_1, \ldots, t_n) \in T(n,m)$. Let $u, \hat n$ be arbitrary representatives of $\kappa^{-1}(t)$, $\kappa^{-1}(\hat t)$, respectively; this means
that $u$ and $\hat n$ are divisions of $n$ by $m$, $u = (u_1, \ldots, u_m)$ and $\hat n = (\hat n_1, \ldots, \hat n_m)$, such that eq. (135) can be
expressed as

$$ E(t) = n(m - 1) + \sum_{j=1}^{m} G(u_j) , \tag{140} $$

with a similar expression for $E(\hat t)$. It follows that

$$ E(t) - E(\hat t) = \sum_{j=1}^{m} \big\{ G(u_j) - G(\hat n_j) \big\} . \tag{141} $$

Since $\hat n$ is optimal, eq. (126) in theorem 21.5 immediately implies that the RHS must be $\ge 0$, hence (137)
holds. ■



Remark: The inclusion in eq. (139) is proper in general. This means that there exist elements in $\mathrm{Min}_p(n,m)$
whose associated minimal partition is not optimal. As an example, consider $n = 6$, $X = \{1, \ldots, 6\}$, with
optimal amount

$$ G(6) = 6 + G(3) + G(3) = 6 + 5 + 5 = 16 . \tag{142} $$

Now compare this with the complete tree $B = B_{po}(2) + B_{po}(4)$, which is a sum of the preoptimized trees $B_{po}(2)$
and $B_{po}(4)$, respectively. $B$ is non-optimal, since the minimal partition $z_{\min}(X)$ is based on the non-optimal
division $(2,4)$ of 6. The fact that $B_{po}(4)$ is preoptimized implies that elements $b \in z_{\min}(4)$ have cardinality
$\#b = 2$, and hence the fact that the whole tree is complete implies that the subtrees $B(b, B_{po}(4))$ are optimal.
Thus, $G_{B_{po}(4)} = G(4) = 8$, whereas $G_{B_{po}(2)} = G(2) = 2$, and thus the tree $B$ has a total amount of

$$ G_B = 6 + 8 + 2 = 16 , \tag{143} $$

which coincides with $G(6)$ in eq. (142) even though the tree $B$ is not optimal.
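The example in the remark can be reproduced by evaluating eq. (140) on the three divisions of 6 into two terms. In the sketch below, `E_of_division` is a hypothetical helper name, and the formula is applied only for $m \ge 2$.

```python
def lg(n: int) -> int:
    """Binary integer logarithm for n >= 1."""
    return n.bit_length() - 1


def G(n: int) -> int:
    """Optimal amount, closed form of eq. (113)."""
    return n * lg(n) + 2 * (n - 2 ** lg(n))


def E_of_division(u):
    """Eq. (140): value of E on a preoptimized tree whose minimal partition has sizes u."""
    n, m = sum(u), len(u)
    return n * (m - 1) + sum(G(part) for part in u)


values = {u: E_of_division(u) for u in [(3, 3), (2, 4), (1, 5)]}
print(values)
```

The optimal division $(3,3)$ and the non-optimal division $(2,4)$ give the same value, whereas $(1,5)$ gives a strictly larger one, which illustrates why the inclusion (139) is proper.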

The next theorem explains how pmin(n, m) changes for fixed n as m increases: 
Theorem 23.2 (Monotonicity of the preoptimized minimum). Let $n \ge 2$ and $1 \le m < n$. Then

$$ \mathrm{pmin}(n, m + 1) > \mathrm{pmin}(n, m) . \tag{144} $$



Proof: 

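The case $m = 1$ yields $\mathrm{pmin}(n,1) = 0$, whereas $\mathrm{pmin}(n,2) = G(n) > 0$ whenever $n \ge 2$. Thus we certainly
have $\mathrm{pmin}(n,1) < \mathrm{pmin}(n,2)$. Therefore assume now that $m \ge 2$. Let $n$ be optimally divided by $(m + 1)$
according to $n = \nu \cdot (m + 1) + r$, where $\nu = \lfloor n/(m+1) \rfloor$ and $r < m + 1$. The naturally ordered representative of
$\hat t(n, m+1)$ is

$$ \hat n = (\hat n_1, \ldots, \hat n_{m+1}) = ( \underbrace{\nu, \ldots, \nu}_{m+1-r} , \underbrace{\nu + 1, \ldots, \nu + 1}_{r} ) . \tag{145} $$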

According to this decomposition we have from eq. (138) in theorem 23.1 and eq. (135) that

$$ E(\hat n) = \mathrm{pmin}(n, m+1) = nm + (m + 1 - r) \cdot G(\nu) + r \cdot G(\nu + 1) . \tag{146} $$

Now define a new division $u$ of $n$ into $m$ terms by

$$ u = (u_1, \ldots, u_m) = (\hat n_2, \ldots, \hat n_m, \hat n_1 + \hat n_{m+1}) . \tag{147} $$

The value of $E$ on any preoptimized tree whose minimal partition $z_{\min}(X)$ corresponds to $u$ can be computed
using eq. (133),

$$ E(u) = n(m - 1) + (m - r) \cdot G(\nu) + (r - 1) \cdot G(\nu + 1) + G(2\nu + 1) , \qquad r > 0 , \tag{148a} $$

$$ E(u) = n(m - 1) + (m - 1) \cdot G(\nu) + G(2\nu) , \qquad r = 0 . \tag{148b} $$

Assume first that $r > 0$: In this case we have $u_m = \hat n_1 + \hat n_{m+1} = 2\nu + 1$. Use eqs. (138), (146), (148a) to compute

$$ \mathrm{pmin}(n, m+1) - E(u) = n + G(\nu) + G(\nu + 1) - G(2\nu + 1) . \tag{149} $$

However, the amount $G(2\nu + 1)$ in the optimal tree $B_o(2\nu + 1)$ over a set with $(2\nu + 1)$ elements can be
expressed using eq. (59) in corollary 12.4 as

$$ G(2\nu + 1) = (2\nu + 1) + G(\nu) + G(\nu + 1) , \tag{150} $$

so that (149) yields

$$ \mathrm{pmin}(n, m+1) - E(u) = \nu (m - 1) + (r - 1) . \tag{151} $$

Since $r \ge 1$ and $m \ge 2$, the RHS is $> 0$. For the case $r = 0$ we use eq. (148b) to obtain in the same way as
above

$$ \mathrm{pmin}(n, m+1) - E(u) = n + 2 G(\nu) - G(2\nu) . \tag{152} $$

However, $G(2\nu) = 2\nu + 2 G(\nu)$, so that (152) becomes

$$ \mathrm{pmin}(n, m+1) - E(u) = \nu (m - 1) , \tag{153} $$

which is again greater than zero. To finish our argument we use eqs. (137), (138) in theorem 23.1, which imply
that

$$ E(u) \ge \mathrm{pmin}(n, m) . \tag{154} $$

Now eqs. (151)-(154) imply the result in eq. (144). ■
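The strict monotonicity (144) can be tabulated from eqs. (138) and (146). The sketch below assumes, as in the proof, that the optimal division consists of $m - r$ terms $\nu$ and $r$ terms $\nu + 1$ with $n = \nu m + r$, and it treats $m = 1$ separately with $\mathrm{pmin}(n,1) = 0$ as stated in the text; the function name `pmin` mirrors the notation but is otherwise our own.

```python
def lg(n: int) -> int:
    """Binary integer logarithm for n >= 1."""
    return n.bit_length() - 1


def G(n: int) -> int:
    """Optimal amount, closed form of eq. (113)."""
    return n * lg(n) + 2 * (n - 2 ** lg(n))


def pmin(n: int, m: int) -> int:
    """Preoptimized minimum for the optimal division of n by m, cf. eq. (146)."""
    if m == 1:
        return 0  # trivial tree {X}: no subtrees, zero amount
    nu, r = divmod(n, m)
    return n * (m - 1) + (m - r) * G(nu) + r * G(nu + 1)


table = [pmin(6, m) for m in range(1, 7)]
print(table)

# Eq. (144): pmin(n, m) increases strictly with m for every tested n.
for n in range(2, 200):
    row = [pmin(n, m) for m in range(1, n + 1)]
    assert all(a < b for a, b in zip(row, row[1:]))
```

For $n = 6$ the smallest value with $m \ge 2$ is attained at $m = 2$ and equals $G(6) = 16$, in line with Corollary 23.3.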

The chain of inequalities in eq. (144) points out that the minimum $\mathrm{pmin}(n,2)$ with respect to a binary
optimal division of the set $X$ is the lowest in the set of all minima $\mathrm{pmin}(n,m)$ with $m \ge 2$. It follows from eq. (129) that
$\mathrm{pmin}(n,2)$ is therefore the global minimum in $p\mathcal{O}^*(n)$. Hence, on using the notation introduced in section 22:



Corollary 23.3 (Minimality of bidivisions). $\mathrm{pmin}(n,2)$ is the global minimum of $E$ on $p\mathcal{O}^*(n)$,

$$ \mathrm{pmin}(n,2) = \mathrm{pmin}(n) . \tag{155} $$

The next theorem explains the role of the optimal trees in the present context: 

Theorem 23.4 (Minimality of optimal division 2). Let $\#X = n \ge 2$. Then the optimal trees minimize
the tree function on the set of all preoptimized trees with two elements in $z_{\min}(X)$, and hence on all preoptimized
trees. In symbols,

$$ \mathcal{O}(X) \subset \mathrm{Min}_p(X,2) \subset \mathrm{Min}_p(X) , \tag{156a} $$

and

$$ G(n) = \mathrm{pmin}(n,2) = \mathrm{pmin}(n) . \tag{156b} $$

Proof: 

Let $B_o \in \mathcal{O}(X)$, then in particular, $B_o$ is preoptimized, and furthermore, the minimal partition $z_{\min}(X)$ is
optimal, i.e., $(t \circ z_{\min})(B_o) = \hat t(n,2)$. Then (137) says that $E(\hat t) = G(n)$ is the minimum in $\mathrm{Min}_p(X,2)$. As a
consequence we must have the first inclusion in (156a), and the first equality in (156b) must hold. The second
inclusion in (156a) is a consequence of corollary 23.3. ■



Now we come to the main theorem of this work: 



Theorem 23.5 (Optimal trees are globally minimal). Let $\#X = n \ge 2$. Then the optimal trees over
$X$ belong to the globally minimal trees over $X$, i.e.,

$$ \mathcal{O}(X) \subset \mathrm{Min}(X) , \tag{157a} $$

and

$$ G(X) = \min(X) = \min(n) . \tag{157b} $$

Proof: 

Since all trees involved in the present discussion are complete, the tree function $E$ always coincides with the
total amount $G$ of the tree, as follows from theorem 13.2. We prove (157) by induction with respect to $n = \#X$.

$n = 2$: In this case there is only one complete tree, and hence $\mathcal{O}(X) = \mathrm{Min}(X)$ trivially.

Induction step: We assume that

$$ \mathcal{O}(n') \subset \mathrm{Min}(n') \tag{158} $$

for all $2 \le n' \le n - 1$. We prove (157) for $\#X = n$ by showing that $G_B \ge G(n)$ for every complete tree
$B \in \mathcal{C}(X)$ over $X$. Let $B \in \mathcal{C}(X)$, let $z_{\min}(X) = (b_1, \ldots, b_m)$ be the minimal partition of $X$ in $B$. Let
$u = (u_1, \ldots, u_m) = (\#b_1, \ldots, \#b_m)$. Now apply eq. (59) in corollary 12.4,

$$ G_B = n(m - 1) + \sum_{j=1}^{m} G_{B(b_j,X)} . \tag{159} $$

The subtrees need not be optimal, hence assumption (158) implies that

$$ G_{B(b_j,X)} \ge G(u_j) \quad \text{for all } j = 1, \ldots, m . \tag{160} $$

Thus, (159) implies that

$$ G_B \ge n(m - 1) + \sum_{j=1}^{m} G(u_j) = G_{B'} , \tag{161} $$

where the RHS of the last formula defines the total amount of the preoptimized tree

$$ B' = \sum_{b \in z_{\min}(X)} B_o(b) \in p\mathcal{O}(X,m) , \qquad B_o(b) \in \mathcal{O}(b) . \tag{162} $$

Now $G_{B'} = E(t)$, where $t$ is the occupation number of $u$, $t = \kappa(u)$; hence, by eqs. (137), (138) in theorem 23.1 we
must have $G_{B'} \ge E(\hat t) = \mathrm{pmin}(n,m)$, where $\hat t$ is now the optimal division of $n$ by $m$. From (144) in theorem 23.2
we know that $\mathrm{pmin}(n,m) \ge \mathrm{pmin}(n,2)$. Using (156b) in theorem 23.4 we have $\mathrm{pmin}(n,2) = \mathrm{pmin}(n) = G(n)$.
Thus, putting all these inequalities together,

$$ G_B \ge G(n) = G(X) , \tag{163} $$

which proves the theorem. ■
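The statement of Theorem 23.5 can be cross-checked for small $n$ by an exhaustive minimization. The sketch below assumes, as in eq. (159), that the total amount of a complete tree depends only on the cardinalities in its minimal partition and on the amounts of the corresponding subtrees; the helper names `divisions` and `min_amount` are hypothetical.

```python
from functools import lru_cache


def lg(n: int) -> int:
    """Binary integer logarithm for n >= 1."""
    return n.bit_length() - 1


def G(n: int) -> int:
    """Optimal amount, closed form of eq. (113)."""
    return n * lg(n) + 2 * (n - 2 ** lg(n))


def divisions(n, m, largest=None):
    """Yield all non-increasing m-tuples of positive integers summing to n."""
    if largest is None:
        largest = n
    if m == 0:
        if n == 0:
            yield ()
        return
    for first in range(min(n - (m - 1), largest), 0, -1):
        for rest in divisions(n - first, m - 1, first):
            yield (first,) + rest


@lru_cache(maxsize=None)
def min_amount(n: int) -> int:
    """Minimum of eq. (159) over all complete trees on n elements (brute force)."""
    if n == 1:
        return 0  # a one-element set is terminal
    best = None
    for m in range(2, n + 1):
        for u in divisions(n, m):
            value = n * (m - 1) + sum(min_amount(part) for part in u)
            if best is None or value < best:
                best = value
    return best
```

On the tested range the brute-force minimum coincides with the closed form $G(n)$, as the theorem asserts.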



24 Mean path amount and quadratic deviation 

From section 21, formula (113), we immediately see that the mean path amount $\frac{1}{n} \sum_i e_i$ will be close to $\lg(n)$.
We can make this statement more precise:

Definition 24.1 (Mean path amount). The mean path amount $\bar e_B$ in the tree $B(X)$ is defined to be

$$ \bar e_B = \min \{\, \eta \in \mathbb{N} \mid \eta \cdot n \ge G_B \,\} . \tag{164} $$

Thus

$$ G_B = \bar e_B \cdot n - r , \quad \text{with } 0 \le r < n . \tag{165} $$

In particular: 

Proposition 24.2 (Mean path amount in optimal trees 1). In an optimal tree $B = B_o$,

$$ \bar e_{B_o} = \begin{cases} \lg(n) , & n \in \mathbb{B} , \\ \lg(n) + 1 , & n \notin \mathbb{B} . \end{cases} \tag{166} $$

Proof: 

If $n \in \mathbb{B}$ then $\bar e_{B_o} = \lg(n)$ follows immediately from (113), since $n = 2^{\lg(n)}$ in this case.
Now assume that $n \notin \mathbb{B}$. From the maximum property of $\lg(n)$ it follows that $n < 2^{\lg(n)+1}$. This can be
rearranged to give

$$ 2 \big[ n - 2^{\lg(n)} \big] < n . \tag{167} $$

If we add $n \lg(n)$ on both sides of this inequality we obtain

$$ n \cdot \lg(n) + 2 \big[ n - 2^{\lg(n)} \big] < n \, [ \lg(n) + 1 ] . \tag{168} $$

However, the LHS is just $E_{B_o}$ according to formula (113), and hence

$$ E_{B_o} < n \, [ \lg(n) + 1 ] . \tag{169} $$

On the other hand, the same formula (113) says that

$$ E_{B_o} \ge n \cdot \lg(n) . \tag{170} $$

Since $n \notin \mathbb{B}$ we have $n = 2^{\lg(n)} + r$, where $0 < r < 2^{\lg(n)}$. In this case the inequality in formula (170) becomes
proper, and thus $\lg(n) + 1$ satisfies the minimum property in formula (164), definition 24.1. ■
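Definition 24.1 says that $\bar e_B$ is the ceiling of $G_B / n$, so Proposition 24.2 can be checked directly from the closed form (113). The following sketch is illustrative; `mean_path_amount` and `in_binary_basis` are our own helper names.

```python
def lg(n: int) -> int:
    """Binary integer logarithm for n >= 1."""
    return n.bit_length() - 1


def G(n: int) -> int:
    """Optimal amount, closed form of eq. (113)."""
    return n * lg(n) + 2 * (n - 2 ** lg(n))


def mean_path_amount(total_amount: int, n: int) -> int:
    """Smallest natural number eta with eta * n >= total_amount, eq. (164)."""
    return -(-total_amount // n)  # integer ceiling division


def in_binary_basis(n: int) -> bool:
    """True if n is a power of two, i.e. an element of the binary basis."""
    return (n & (n - 1)) == 0


# Eq. (166): lg(n) on the binary basis, lg(n) + 1 otherwise.
for n in range(1, 5000):
    expected = lg(n) if in_binary_basis(n) else lg(n) + 1
    assert mean_path_amount(G(n), n) == expected
```

For $n = 3$, for instance, $G(3) = 5$ and the smallest $\eta$ with $3\eta \ge 5$ is $\eta = 2 = \lg(3) + 1$.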

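We now come back to eq. (165), $G_B = (\bar e_B \cdot n - r)$, where $r < n$. Now let us define the $n$-tuple

$$ (\bar e_1, \ldots, \bar e_n) = ( \underbrace{\bar e_B, \ldots, \bar e_B}_{n-r} , \underbrace{\bar e_B - 1, \ldots, \bar e_B - 1}_{r} ) . \tag{171} $$

Thus, for any tree $B$ over $X$ (which need not be optimal) we have

$$ G_B = \sum_{i=1}^{n} e_i = \sum_{i=1}^{n} \bar e_i = (n - r) \cdot \bar e_B + r \cdot (\bar e_B - 1) , \tag{172} $$

where we have used eqs. (165), (171). Introducing the $n$-tuple of deviations

$$ \Delta e = (\Delta e_1, \ldots, \Delta e_n) = (e_1 - \bar e_1, \ldots, e_n - \bar e_n) , \tag{173} $$

and the total quadratic deviation in $B(X)$ by

$$ \sigma^2_{\mathrm{tot}}(B) = \sum_{i=1}^{n} (\Delta e_i)^2 , \tag{174} $$

we find on using (172) that

$$ \sigma^2_{\mathrm{tot}} = \sum_{i=1}^{n} \big( e_i^2 - \bar e_i^2 \big) + 2 \sum_{i=n-r+1}^{n} \Delta e_i . \tag{175} $$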



We now present some statements about the mean path amount in optimal trees. In every tree $B$, the
elements $b_1, \ldots, b_K$ in the maximal partition $z_{\max}(X)$ can be labelled so that the associated path amounts are
monotonically decreasing, $e_1 \ge e_2 \ge \cdots \ge e_K$. In particular, if $\bar e_{B_o}$ is the mean path amount in the optimal tree
$B_o$ as defined in eq. (164), then we have:



Theorem 24.3 (Mean path amount in optimal trees 2). Let $G(n)$ be the amount of the optimal tree $B_o$
with $n = \#X$. Let $r = n \cdot \bar e_{B_o} - G(n)$. Then, if $n \in \mathbb{B}$,

$$ e_i = \lg(n) \tag{176a} $$

for all $i = 1, \ldots, n$, whereas for $n \notin \mathbb{B}$,

$$ e_i = \begin{cases} \lg(n) + 1 , & i = 1, \ldots, n - r , \\ \lg(n) , & i = n - r + 1, \ldots, n . \end{cases} \tag{176b} $$

Hence, in any case,

$$ e_i = \bar e_i , \quad i = 1, \ldots, n , \tag{177} $$

where the tuple $(\bar e_1, \ldots, \bar e_n)$ was defined in eq. (171).

Proof: 

We first prove (176a) by induction with respect to $\lg(n)$: For $n = 1, 2$, corresponding to $\lg(n) = 0, 1$, (176a)
is trivially satisfied. Now choose $k = \lg(n) > 1$ and suppose that (176a) is true for $k - 1$. From formula (36)
we know that the amount $e(b)$ of a path in any optimal tree is equal to $\#q(b) - 1$, since $m(b) = 2$ for all
non-terminal elements, while $m(b) = 1$ for all terminal ones. Thus, the amount $e(b)$ of any terminal element is
1 plus the amount of the same element in the subtree $B(a,X)$, where $a \in z_{\min}(X)$ and $b \subset a$. By assumption,
$n = 2^k$, hence the minimal partition of $X$ contains two elements $a_1$ and $a_2$ both of which must have the
same cardinality $\#a_1 = \#a_2 = 2^{k-1}$. Let $b$ be any terminal element in the tree such that $b \subset a_1$, say. Then
$e_{B(a_1,X)}(b) = k - 1$ by assumption. In the full tree, the path length of the same element $b$ is greater by just
one, hence $e_B(b) = k = \lg(n)$, which confirms (176a).

Now assume $n \notin \mathbb{B}$. We first show: The path lengths $e_i$ can mutually differ at most by $\pm 1$,

$$ | e_i - e_j | \in \{0, 1\} . \tag{178} $$

We prove this statement by induction with respect to $n$: For $n = 1$ and $n = 2$ the path lengths in the optimal
trees are 0 and 1, respectively, and hence (178) is satisfied. For $n = 3$ the path amounts in the optimal tree
are $e_1 = e_2 = 2$ and $e_3 = 1$; again, (178) is satisfied. Now let $n \ge 4$ and assume that statement (178) is true
for all $1 \le n' \le n - 1$. Let $b, b'$ be any two elements in the maximal partition $z_{\max}$ of $X$. Let $a, a'$ denote
those elements in the minimal partition $z_{\min}(X)$ for which $b \subset a$ and $b' \subset a'$ (this includes the possibility that
$a = a'$). Then $\#a, \#a' < n$, and the induction assumption applies to the path amounts in the optimal subtrees
$B(a,X)$ and $B(a',X)$: Namely, since $e(b) = e_{B(a,X)}(b) + 1$ and $e(b') = e_{B(a',X)}(b') + 1$ we must have

$$ | e(b) - e(b') | = | e_{B(a,X)}(b) - e_{B(a',X)}(b') | \in \{0, 1\} . \tag{179} $$

This proves formula (178).

From formula (178) we infer that there exist integers $a$ and $k$ such that

$$ a (k + 1) + (n - a) k = G(n) , \tag{180} $$

where $0 \le a < n$. If $a$ were zero we would have $G(n) = nk$, and together with eq. (113) it would follow that

$$ n \, [ \lg(n) + 2 - k ] = 2^{\lg(n)+1} . \tag{181} $$

By means of prime number factorisation of the factors on the left-hand side we conclude that $n$ must take the
form $n = 2^K$ for some integer $K$, thus implying $n \in \mathbb{B}$, which contradicts the initial assumption. Hence we
really have $a > 0$. Now eq. (180) can be written in the form

$$ G(n) = n \cdot k + a , \quad 0 < a < n . \tag{182} $$

If $n$ and $G(n)$ are given numbers we can consider (182) as an equation for the unknowns $a, k$. If the restriction
$0 < a < n$ is upheld, then the solution for the pair $(a, k)$ is unique. Now consider formula (113) for $G(n)$,

$$ G(n) = n \cdot \lg(n) + 2R , \quad R = n - 2^{\lg(n)} . \tag{113} $$

From formula (106b) in lemma 20.3 we know that $2R < n$. Thus, the pair

$$ a = 2R , \quad k = \lg(n) \tag{183} $$

is the unique solution to the system (182). As a consequence, amongst the $e_i$ there must be $(n - 2R)$ occurrences
of $\lg(n)$ and $2R$ occurrences of $\lg(n) + 1$,

$$ e_i = \begin{cases} \lg(n) + 1 , & i = 1, \ldots, 2R , \\ \lg(n) , & i = 2R + 1, \ldots, n . \end{cases} \tag{184} $$

It remains to show that $n - r = 2R$: To this end we write down formula (165) for the case at hand, i.e., an
optimal tree with $n \notin \mathbb{B}$, in which case (166) applies,

$$ G(n) = [ \lg(n) + 1 ] \cdot n - r . \tag{185} $$

Comparison of (185) with formula (113) gives $2R = n - r$, hence (176b) is proved. ■
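The distribution of path amounts in (184) can be observed by listing the leaf depths of a tree obtained by repeated halving. The sketch below assumes, following the proof above, that in an optimal (binary) tree the path amount of a terminal element equals its depth; `leaf_depths` is a hypothetical helper.

```python
def lg(n: int) -> int:
    """Binary integer logarithm for n >= 1."""
    return n.bit_length() - 1


def leaf_depths(n: int) -> list[int]:
    """Depths of the n terminal elements when every node is halved as evenly as possible."""
    if n == 1:
        return [0]
    small = n // 2
    large = n - small
    return [depth + 1 for depth in leaf_depths(large) + leaf_depths(small)]


for n in range(1, 300):
    depths = leaf_depths(n)
    R = n - 2 ** lg(n)
    # Eq. (184): 2R elements with amount lg(n) + 1 and n - 2R with amount lg(n).
    assert depths.count(lg(n) + 1) == 2 * R
    assert depths.count(lg(n)) == n - 2 * R
    # The depths sum to the optimal amount of eq. (113).
    assert sum(depths) == n * lg(n) + 2 * R
```

For $n = 5$ one finds the depths $3, 3, 2, 2, 2$, i.e. $2R = 2$ elements with amount $\lg(5) + 1 = 3$.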



25 Isomorphic trees 

In this section we formulate a notion of structural similarity between trees B and B' which no longer need to 
be defined over the same set X. This will lead to an appropriate notion of isomorphism of trees. 

Consider a tree $B$ over a set $X$. The structure of the tree $B$ is captured in the set of its nodes $b$, and
the degree of splitting $m(b)$ associated with each node. The particular nature of the underlying set $X$, just
as the particular value $n(b)$ of the cardinality of the nodes, is not a primary structure-determining element.
To see this we can construct different trees from the given tree $B$ which exhibit the same structure: To this
end consider the maximal partition $z_{\max}(X,B) = \{c_1, \ldots, c_K\}$ of $X$ in $B$. The elements $c_i \in z_{\max}(X,B)$ are
terminal in this tree and are never partitioned further; this means that their "internal structure" is immaterial,
as far as the tree $B$, and its internal structure, are concerned. Now consider any collection $\{c'_1, \ldots, c'_K\}$ of
non-empty, mutually disjoint sets $c'_i$ with $i = 1, \ldots, K$, and let $X' = \bigcup_i c'_i$. We can think of constructing a
new tree $B'$ by replacing every terminal element $c_i$ in the old tree $B$ by the corresponding element $c'_i$. Then
there is a 1-1 relation between nodes $b \in B$ and $b' \in B'$; moreover, the minimal partitions $z_{\min}(b', B'(X'))$ and
$z_{\min}(b, B(X))$ are the same for all nodes $b$ and $b'$ which correspond to each other. In particular, the degree of
splitting $m(b')$ is the same as $m(b)$ for such nodes. Obviously, both trees have the same cardinality, $\#B' = \#B$.
This idea can be made precise in the following

Definition 25.1 (Isomorphism of trees). Let $B$ and $B'$ be trees over sets $X$ and $X'$ with the same cardinality,
$\#B = \#B'$. $B$ and $B'$ are isomorphic if there exists a bijection $i : B \to B'$ such that

$$ m \circ i(b) = m(b) \quad \text{for all } b \in B . \tag{186} $$

For a given pair B and B' of trees there can exist more than one isomorphism. 

Proposition 25.2 (Paths in isomorphic trees). Let $i : B \to B'$ be an isomorphism of trees. Then the path
assignment $b \mapsto q(b)$ commutes with $i$,

$$ q \circ i = i \circ q . \tag{187a} $$

As a consequence, isomorphic trees have the same path amounts,

$$ e_{iB}(i(b)) = e_B(b) \quad \text{for all } b \in B . \tag{187b} $$


Proof: 

Since isomorphic trees have the same basic structure in the sense that they have the same number of nodes,
and each node has the same degree of splitting, the paths in isomorphic trees have the property that

$$ q(i(b)) = i(q(b)) \quad \text{for all } b \in B , \tag{188} $$

which gives (187a). Furthermore, $m \circ i = m$ by definition of isomorphism, eq. (186). Then the statement (187b)
follows immediately from eq. (36) in definition 11.1. ■



For a given tree $B$ we can think of the category $[B]$ of isomorphic trees. Even if $B$ is defined over a finite
set $X$, general elements of $[B]$ need no longer share this property; all that is required is that they have the
same number of nodes and the same degree of splitting $m(b)$ as the original tree $B$ at each node $b$. From
proposition 25.2 we learn that all trees in the category $[B]$ have the same path lengths $o(b) = \#q(b)$, and the
same path amounts $e(b)$. In general they differ in the total amount $E_B$, however.

There exists a stronger form of isomorphism which can be defined for trees which are built over the same 
base set X: 

Definition 25.3 (Equivalent trees). Let $B$ and $B'$ be trees over the same base set $X$. $B$ and $B'$ are said to
be equivalent if they are isomorphic and share the same maximal partition of $X$,

$$ z_{\max}(X,B) = z_{\max}(X,B') . \tag{189} $$



Let us assume that $X$ is finite. Then there exists an integer $K$ such that $z_{\max}(X,B) = \{c_1, \ldots, c_K\}$ and
$z_{\max}(X,B') = \{c'_1, \ldots, c'_K\}$. Equivalence of $B$ and $B'$ then means that there exists a permutation $\pi$ of $K$
elements such that

$$ c'_j = c_{\pi(j)} \quad \text{for all } j = 1, \ldots, K . \tag{190} $$

But $B$ and $B'$ are isomorphic, hence $c'_j = i(c_j)$, where $i : B \to B'$ is an appropriate isomorphism. By eq. (187b),
the path amounts are related by $e_{B'}(c'_j) = e_B(c_j) = e_j$. Let $w_j = n(c_j)$; then it follows from
theorem 13.2 that the following statements are true:

Theorem 25.4 (Tree function on equivalent trees). Let $B$ and $B'$ be equivalent trees over the finite set
$X$. Then

$$ E_B = \sum_{j=1}^{K} w_j \cdot e_j , \qquad E_{B'} = \sum_{j=1}^{K} w_{\pi(j)} \cdot e_j , \tag{191} $$

where $\pi$ is a permutation of $K$ objects.



Each category $[B]$ contains preferred elements $S$ which we can construct as follows: Let $B \in [B]$, and
consider the maximal partition $z_{\max}(X,B) = \{c_1, \ldots, c_K\}$ as before. Now define the set $X_S = \{1, \ldots, K\}$. $S$
is now defined to be a tree over $X_S$, isomorphic to $B$, and is obtained by replacing every terminal element $c_i$
in the maximal partition $z_{\max}(X,B)$ by the one-element set $\{i\}$. More generally, every node $b = c_{i_1} \cup \cdots \cup c_{i_k}$
is replaced by the set $\{i_1, \ldots, i_k\}$. By construction, the tree so obtained has the same number of nodes as $B$
and has the same degree of splitting at each node. However, everything about the internal structure of the
terminal elements $c_i$ in the maximal partition of $B$ has been stripped away, so that the tree now incorporates
nothing more than the inherent structure which is shared by all trees in the category $[B]$. It is then befitting
to call such a tree $S$ a "skeleton" of $B$, and hence a skeleton in the respective category. The defining criterion
of a skeleton is the fact that all terminal elements are one-element sets, i.e., that $S$ is complete:

Definition 25.5 (Skeleton). A tree $S \in [B]$ which is complete is called a skeleton in the category $[B]$.

Thus, it is the skeletons which embody the inherent structure in the category $[B]$.

Proposition 25.6 (Tree function on isomorphic trees). Let $B$ be a tree over $X$. Let $S$ be a skeleton in
the category $[B]$. Then

$$ E_B = \sum_{c \in z_{\max}(X,B)} n(c) \cdot e_S(i(c)) , \tag{192} $$

where $i : B \to S$ is the associated isomorphism.



Proof: 

From eq. (187b) in proposition 25.2 we know that $e_B(c) = e_S(i(c))$ for all $c \in z_{\max}(X,B)$; if this is inserted
into the expression for $E_B$ in theorem 13.2, (192) follows. ■

It follows that the tree function $E$, when restricted to the category $[B]$, takes its minimum on the skeletons
$S \in [B]$, since for these, $n(c) = 1$ for all $c \in z_{\max}(X_S, S)$.
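The relation between a tree and its skeleton can be illustrated on a small example. The sketch below makes two assumptions that are not spelled out in this section: a tree is encoded as nested lists whose integer leaves give the cardinality $n(c)$ of a terminal element, and the path amount of a terminal element is the sum of $m(b) - 1$ over the non-terminal nodes $b$ on its path (consistent with $e(b) = \#q(b) - 1$ for binary trees). The helper names `tree_function` and `skeleton` are hypothetical.

```python
def tree_function(node, amount=0):
    """E_B = sum over terminal elements c of n(c) * e(c), cf. eqs. (191) and (192).

    `node` is either an int (a terminal element of that cardinality) or a list
    of child nodes; `amount` is the path amount accumulated so far.
    """
    if isinstance(node, int):
        return node * amount
    m = len(node)  # degree of splitting of this node
    return sum(tree_function(child, amount + m - 1) for child in node)


def skeleton(node):
    """Replace every terminal element by a one-element set, keeping the branching."""
    if isinstance(node, int):
        return 1
    return [skeleton(child) for child in node]


# A tree over 9 elements: the root splits into two nodes, one of which splits
# again into terminal elements of sizes 3 and 2; the other is terminal with size 4.
B = [[3, 2], 4]
S = skeleton(B)

E_B = tree_function(B)   # 3*2 + 2*2 + 4*1
E_S = tree_function(S)   # 1*2 + 1*2 + 1*1
assert E_S <= E_B
```

Here the skeleton has $E_S = 5 = G(3)$, while the original tree has the larger value $E_B = 14$, in line with the minimality of skeletons within a category.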

26 Restricted minimal problems 

In the previous sections we have solved the problem of minimizing the tree function E on the set of all 
unconstrained complete trees over the set X. By unconstrained we mean that no conditions on the possible 
trees B over X were imposed other than requiring that B must not be trivial. We now investigate how to 
extend the framework we have worked in so far in order to obtain tree functions that contain expressions like
$\sum_i p_i \log_2(p_i)$ in the functional form of their minimal value, when restricted to certain classes of tree structures
over $X$.

Amongst the countless ways to constrain the set of admissible trees we shall consider the following two 
cases only: For a given partition z of the base set X we first study the set of all trees preserving the partition 
z; and then, the set of all trees containing z. 

26.1 Trees preserving a partition 

A complete tree has a maximal partition of $X$ which is complete, i.e., the elements of $z_{\max}(X)$ are
the one-element subsets $\{x\}$ for $x \in X$. Trivially, every partition $z$ of $X$ preserves $z_{\max}(X)$ in the sense that
$z_{\max}$ is a refinement of every partition $z$ of $X$. We now generalize this reasoning to the case where $z_{\max}(X)$
is no longer complete: We want to prescribe a partition $z$ of $X$ such that the relation $z' \preceq z$ is true for all
$z' \in \zeta(X,B)$ compatible with $B$. In particular, for the maximal partition of $X$ in $B$ we must have $z_{\max}(X) \preceq z$.
If such a relation is true we shall say that the tree $B(X)$ preserves the partition $z$. In general, the prescribed
partition $z$ that is preserved by the admissible trees $B$ need not be an element of $\zeta(X,B)$ itself; in this case it
induces a non-trivial partition on at least one of the elements $b \in z_{\max}(X)$ which are terminal in $B$, so that
the resulting refinement $z$ of $z_{\max}(X)$ is compatible with the resulting extension of $B$. Alternatively, we can
have $z = z_{\max}(X)$; in this case, $z$ is the most refined partition compatible with the tree $B$. These ideas lead to

Definition 26.2 (z-preserving, z-complete trees). Let $B \in \mathcal{M}(X)$, and let $z \in Z(X)$ be a partition of X. B
is called z-preserving if $z' \preceq z$ for all $z' \in \zeta(X, B)$. B is called z-complete if $z_{\max}(X, B) = z$.

It is clear that, without further conditions, it makes no sense to ask for the minimum of E on the set of 
all z-preserving trees over X, as the answer is trivial: If z is given, the minimum is taken on the trivial tree 
B = {X}, since the trivial tree preserves every partition. And even if this trivial solution is excluded, then the 
tree function E takes its minimum on any binary non-trivial partition of X which preserves z; the associated 
value of the minimum can be inferred immediately from eq. (66) to be $E = n$.

However, a meaningful minimal problem can be given on the smaller set of z-complete trees, which we shall 
denote by $C(z)$. The minimum of E on $C(z)$ will be denoted by $\min_{+}(z)$. The subset of all trees in $C(z)$ on
which E actually takes the minimum will be written as $\mathrm{Min}_{+}(z)$; it coincides with the set $E^{-1}(\min_{+}(z)) \cap C(z)$.

In the present work we shall not attempt to solve this minimal problem; however, we provide a necessary 
condition which arises in the course of its study: 

Proposition 26.3. Let $z = \{c_1, \ldots, c_K\}$ be the common maximal partition in $C(z)$. Let $B \in \mathrm{Min}_{+}(z)$. Then,

$$w_i < w_j \;\Rightarrow\; e_i \ge e_j , \tag{193}$$

where $w_k = n(c_k)$, and $e_k$ are the path amounts of $c_k$ in B.



Proof: 

Let $B \in \mathrm{Min}_{+}(z)$ with path amounts $e_k$, $k = 1, \ldots, K$. From eq. (66) in theorem 13.2 we know that

$$E_B = \sum_{k=1}^{K} w_k \cdot e_k . \tag{194}$$

Assume that there exist $i \neq j$ with $w_i < w_j$ such that $e_i < e_j$. We define a new tree B' by the statements: (1)
B' is equivalent to B; and (2) the maximal partition $z_{\max}(X, B') = \{c'_1, \ldots, c'_K\}$ of X in B' is such that

$$c'_k = \tau(i,j)\, c_k , \tag{195}$$

where $\tau(i,j)$ is the transposition of i and j. The path amounts are the same by definition of equivalence,

$$e_{B'}(c'_k) = e_B(c_k) = e_k , \tag{196}$$

hence the tree function on B' takes the value

$$E_{B'} = \sum_{k \neq i,j} w_k \cdot e_k + w_j \cdot e_i + w_i \cdot e_j . \tag{197}$$

It follows that

$$E_{B'} - E_B = (w_j - w_i)(e_i - e_j) < 0 , \tag{198}$$

and hence $E_{B'} < E_B$, which contradicts the minimal property of B. Thus, the initial assumption was wrong,
and implication (193) must be true. ■
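
The exchange step of this proof can be checked numerically. The following is a minimal sketch and not part of the original argument: a tree is represented only by its weights $w_k$ and path amounts $e_k$, as in eq. (194); the function names and the example values are hypothetical.

```python
from itertools import permutations

def tree_value(weights, amounts):
    """E_B = sum_k w_k * e_k, as in eq. (194)."""
    return sum(w * e for w, e in zip(weights, amounts))

def swap_gain(weights, amounts, i, j):
    """E_{B'} - E_B when terminal nodes i and j exchange positions, eq. (198)."""
    return (weights[j] - weights[i]) * (amounts[i] - amounts[j])

def best_assignment(weights, amounts):
    """Brute force: minimum of E over all assignments of weights to the fixed path amounts."""
    return min(tree_value(p, amounts) for p in permutations(weights))

# Hypothetical example: w_0 < w_1 and e_0 < e_1, so condition (193) is violated.
weights = [1, 5, 2]   # w_k = n(c_k)
amounts = [1, 2, 2]   # e_k

# Exchange the terminal nodes 0 and 1 while keeping the path amounts fixed.
swapped = weights.copy()
swapped[0], swapped[1] = swapped[1], swapped[0]

difference = tree_value(swapped, amounts) - tree_value(weights, amounts)
print(difference, swap_gain(weights, amounts, 0, 1))   # both are negative
print(best_assignment(weights, amounts))               # heaviest node gets the smallest amount
```

In this example the exchange lowers the tree function, and the brute-force minimum is attained when the largest weight carries the smallest path amount, in line with (193).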



26.4 Trees containing a partition 

Another construction is the set of trees containing the partition z: The idea is that we can constrain trees by 
requiring that all admissible trees contain the elements of a prescribed partition; this leads to the 

Definition 26.5 (Trees containing a partition). Let $z \in Z(X)$. The tree B is said to contain the partition
z if

$$z \subseteq B(X) , \tag{199a}$$
$$a \in B(X) \quad \text{for all } a \in z \tag{199b}$$

is true.

Without further conditions, the minimum of E will always be taken on a tree for which the prescribed
partition z coincides with the maximal partition in this tree; for, any further splitting, beyond the nodes $a \in z$,
can only increase the value of E. It follows that there are two possibilities for meaningful minimal problems:
(1) We require that, for all admissible trees, the maximal partition $z_{\max}(X, B)$ agrees with z; or, (2) we require
that all admissible trees are complete. Clearly, case (1) agrees with the minimal problem on the set $C(z)$ of all
z-complete trees as discussed in the last paragraph after definition 26.2; the minimum of E is $\min_{+}(z)$, and
the set of all trees on which the minimum is taken is $\mathrm{Min}_{+}(z)$. Case (2) defines another meaningful minimal
problem which nevertheless can be traced back to case (1): Suppose that B is an admissible tree with respect
to case (2); then B can be regarded as the completion of a reduced tree B', where B' is an element in the set
$C(z)$ of z-complete trees. It is then clear that minimal trees with respect to case (2) are those for which the
reduction B' is minimal in $C(z)$, in other words, $B' \in \mathrm{Min}_{+}(z)$, and for which the subtrees $B(b_i, X)$, $b_i \in z$, are
optimal. The minimal value of the tree function E in case (2) then will be a sum of $\min_{+}(z)$ and another sum
over expressions $w_i \lg(w_i) + 2[w_i - 2^{\lg(w_i)}]$, where $w_i = n(b_i)$, $b_i \in z$,

$$E_{\min} = \min{}_{+}(z) + \sum_{i=1}^{K} \left\{ w_i \cdot \lg(w_i) + 2 \cdot \left[ w_i - 2^{\lg(w_i)} \right] \right\} . \tag{200}$$

Here we have again assumed that $\#z = K$.



26.6 Trees with a prescribed minimal partition $z_{\min}(X)$



Another minimal problem can be obtained by prescribing the minimal partition $z_{\min}(X) = z$ of X and
demanding that all trees in this class be complete. We shall denote the associated class of trees by $C_{-}(z)$. The
minimum of E taken in $C_{-}(z)$ will be written as $\min_{-}(z)$, while the subset of $C_{-}(z)$ on which this minimum is
actually taken will be denoted by $\mathrm{Min}_{-}(z)$; the latter coincides with the intersection $E^{-1}(\min_{-}(z)) \cap C_{-}(z)$.

The solution to this minimal problem is readily found: Let us suppose that the minimal partition of X is
prescribed to be

$$z_{\min}(X) = z = \{c_1, \ldots, c_K\} , \tag{201a}$$
$$w_i = n(c_i) , \quad i = 1, \ldots, K . \tag{201b}$$

Since all admissible trees are complete, by eq. (67) in theorem 13.2 the tree function E on $C_{-}(z)$ coincides
with the total amount function G. Eq. (59) in corollary 12.4 then implies that

$$E_B = n(K-1) + \sum_{i=1}^{K} G_B(c_i, X) , \tag{202}$$

for all $B \in C_{-}(z)$. It follows that $E_B$ becomes minimal if all subtrees $B(c_i, X)$ become optimal; in other
words, if

$$G_B(c_i, X) = G(w_i) \quad \text{for all } i = 1, \ldots, K . \tag{203}$$

But the values of the quantities $G(w_i)$ are given in eq. (113) of theorem 21.1,

$$G(w_i) = w_i \cdot \lg(w_i) + 2 \cdot \left[ w_i - 2^{\lg(w_i)} \right] . \tag{204}$$

On inserting (203), (204) into (202) we have proven:

Theorem 26.7 (Minimal problem on $C_{-}(z)$). The minimum of E on $C_{-}(z)$ is equal to

$$\min{}_{-}(z) = n(K-1) + \sum_{i=1}^{K} \left\{ w_i \cdot \lg(w_i) + 2 \cdot \left[ w_i - 2^{\lg(w_i)} \right] \right\} . \tag{205}$$

The set $\mathrm{Min}_{-}(z)$ contains those trees which are sums of optimal trees over the quantities $w_i$,

$$B = \sum_{i=1}^{K} B_o(c_i) \;\Rightarrow\; B \in \mathrm{Min}_{-}(z) . \tag{206}$$
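
As a plausibility check of eqs. (204) and (205), the sketch below compares the closed form $G(w)$ with a brute-force recursion. It rests on assumptions that are not stated in this section: lg is read as the integer part of the dyadic logarithm, only binary splittings are searched, and a binary splitting of a node with w elements is taken to add w to the total amount, which is how eq. (202) reads for K = 2. All function names are hypothetical.

```python
from functools import lru_cache

def lg(w):
    """Integer part of log2(w) for a positive integer w (assumed meaning of lg)."""
    return w.bit_length() - 1

def G_closed(w):
    """Closed form of eq. (204): w*lg(w) + 2*(w - 2**lg(w))."""
    return w * lg(w) + 2 * (w - 2 ** lg(w))

@lru_cache(maxsize=None)
def G_recursive(w):
    """Smallest total amount over all complete binary trees on w elements (brute force)."""
    if w == 1:
        return 0  # a one-element set needs no further splitting
    # Assumption: one binary split costs w, then both parts are split optimally.
    return w + min(G_recursive(a) + G_recursive(w - a) for a in range(1, w // 2 + 1))

def min_minus(weights):
    """Eq. (205): n*(K-1) + sum_i G(w_i), with n taken as the sum of the weights."""
    n, K = sum(weights), len(weights)
    return n * (K - 1) + sum(G_closed(w) for w in weights)

# The closed form and the brute-force recursion agree on a range of sizes.
print(all(G_closed(w) == G_recursive(w) for w in range(1, 65)))
# Hypothetical prescribed minimal partition with block sizes 4, 2, 2 (n = 8, K = 3).
print(min_minus([4, 2, 2]))
```

Under these assumptions the recursion reproduces the closed form, and `min_minus` is a direct transcription of (205) for a list of block sizes $w_i$.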

We can rewrite the result (205) in such a way that probability-like quantities $\frac{w_i}{n} \in \mathbb{R}$ appear:

$$\frac{1}{n}\min{}_{-}(z) - \lg(n) = (K-1) + \sum_{i=1}^{K} \frac{w_i}{n}\left[\lg(w_i) - \lg(n)\right] + \frac{1}{n}\sum_{i=1}^{K} 2 \cdot \left[ w_i - 2^{\lg(w_i)} \right] . \tag{207}$$

The quantity $[\lg(w_i) - \lg(n)]$ is evidently an approximation to $\log_2\left(\frac{w_i}{n}\right)$, so that the right-hand side of (207)
contains an integer approximation to the Shannon-Wiener entropy with respect to the "probabilities" $p_i = \frac{w_i}{n}$,
where $i = 1, \ldots, K$.
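
The relation to the Shannon-Wiener expression can be made concrete in the special case where n and all $w_i$ are powers of two: the last sum in (207) then vanishes and the bracket equals $\log_2(w_i/n)$ exactly. The following sketch illustrates this under the assumption that lg denotes the integer part of the dyadic logarithm; the function names and the example values are hypothetical.

```python
import math

def lg(w):
    """Integer part of log2(w) for a positive integer w (assumed meaning of lg)."""
    return w.bit_length() - 1

def rhs_207(weights):
    """Right-hand side of eq. (207) for block sizes w_i, with n = sum of the w_i."""
    n, K = sum(weights), len(weights)
    entropy_like = sum(w / n * (lg(w) - lg(n)) for w in weights)
    correction = sum(2 * (w - 2 ** lg(w)) for w in weights) / n
    return (K - 1) + entropy_like + correction

def shannon_sum(weights):
    """sum_i p_i * log2(p_i) with p_i = w_i / n."""
    n = sum(weights)
    return sum(w / n * math.log2(w / n) for w in weights)

weights = [4, 2, 2]   # dyadic example: n = 8, K = 3
print(rhs_207(weights))                           # 0.5
print((len(weights) - 1) + shannon_sum(weights))  # 0.5
```

For block sizes that are not powers of two the two printed values differ, which is the sense in which (207) is only an integer approximation.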



27 Tree structures and neighbourhood topology 

Finally, we want to put forward arguments to show how tree structures define a neighbourhood topology on 
the underlying set X. We now allow the set X to have arbitrary cardinality; in particular, X can be
non-countable. We recall that the path q(b) of a node $b \in B(X)$ was defined to be the set of all elements b' in B
containing b as a subset. We now extend this definition so as to speak of the path of any single element $x \in X$
in the base set: For every $x \in X$ there exists precisely one terminal element $b_x \in B$ such that $x \in b_x$; we can
then decree that the path of the element x in the tree B be the path of the associated terminal element, and
this assignment will be unique. Thus,



Definition 27.1 (Path of points in base set X). Let $B \in \mathcal{M}(X)$ be a given tree over the base set X. Let
$x \in X$, and let $b_x$ be the uniquely determined terminal node in B which contains x as an element. Then the path
of x in B is defined by

$$q(x) = q(b_x) . \tag{208}$$

If the degree of splitting m(b) remains finite at every node $b \in B$, the path of x will be a countable subset
of the tree B. 

Proposition 27.2 (The path of points). Let B be a given tree over X. Then, for $x \in X$ and $b \in B$,

$$x \in b \;\Leftrightarrow\; b \in q(x) . \tag{209}$$



Proof: 

Let $x \in X$ and assume that $b \in q(x)$. There exists a unique $b' \in z_{\max}(X, B)$ such that $x \in b'$. Thus,
$q(b') = q(x)$, and therefore $b \supseteq b' \ni x$, which proves that $b \in q(x)$ implies $x \in b$.

Conversely, let b' be the unique terminal element in $z_{\max}(X, B)$ such that $x \in b'$. Then $x \in b$ implies that
$b \cap b' \neq \emptyset$. Now, axiom (A2) for tree structures implies that either $b \subsetneq b'$ or $b' \subseteq b$. The first inclusion cannot be true
since b' is a terminal element; thus, $b' \subseteq b$, hence it follows that $b \in q(b') = q(x)$. ■
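
Proposition 27.2 can be illustrated on a small finite example. The sketch below is only an illustration under stated assumptions: a tree is modelled as a finite family of `frozenset` nodes in which any two nodes are either nested or disjoint, and the function names and the example tree are hypothetical.

```python
def terminal_nodes(tree):
    """Nodes of the tree that contain no other node as a proper subset."""
    return [b for b in tree if not any(c < b for c in tree)]

def path_of_node(tree, b):
    """q(b): all nodes of the tree that contain b as a subset."""
    return {c for c in tree if b <= c}

def path_of_point(tree, x):
    """q(x) = q(b_x), where b_x is the unique terminal node containing x, eq. (208)."""
    (b_x,) = [b for b in terminal_nodes(tree) if x in b]
    return path_of_node(tree, b_x)

# Hypothetical tree over X = {0, 1, 2, 3}; the terminal node {2, 3} is not split further.
X = frozenset({0, 1, 2, 3})
tree = {X, frozenset({0, 1}), frozenset({2, 3}), frozenset({0}), frozenset({1})}

# Proposition 27.2: a node b lies in the path of x exactly if x is an element of b.
holds = all((x in b) == (b in path_of_point(tree, x)) for x in X for b in tree)
print(holds)
```

Note that the tree in this example is not complete, so the check also covers the case where {x} itself is not a node.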

We now show that the given tree structure B over X defines a neighbourhood topology on X. We recall
[25, 24] that a neighbourhood topology $\mathcal{N}$ assigns a collection $\mathcal{N}(x)$ of distinct subsets N of X to every point
x in X; $\mathcal{N}$ is just the collection of all $\mathcal{N}(x)$. The elements $N \in \mathcal{N}(x)$, which are subsets of X, are called
neighbourhoods of x in the topology $\mathcal{N}$, if they satisfy the axioms [23]

(N1) If N is a neighbourhood of x, then $x \in N$.

(N2) If N is a subset of X containing a neighbourhood of x, then N is a neighbourhood of x.

(N3) The intersection of two neighbourhoods of x is again a neighbourhood of x.

(N4) Any neighbourhood N of x contains a neighbourhood M of x such that N is a neighbourhood of each
point of M.

The pair $(X, \mathcal{N})$ is then called a topological space. Furthermore, a base for the neighbourhoods at x is a set
Bas(x) of neighbourhoods of x such that every neighbourhood N of x contains an element $b \in \mathrm{Bas}(x)$. Now
we define the path q(x) to be a neighbourhood base for x, and a subset $N \subseteq X$ to be a neighbourhood of x if
and only if there exists a $b \in q(x)$ that is contained in N. The result is indeed a neighbourhood topology on
X:

Theorem 27.3 (Trees and neighbourhood topology). Every tree structure B(X) over X defines a
neighbourhood topology on X.

Proof: 

If N is a neighbourhood of x in our sense then it contains an element $b \in q(x)$ and therefore contains x as an
element, even if the path q(x), or the tree B, does not contain {x} as an element; thus, (N1) is satisfied. (N2)
is fulfilled automatically by our definition. Let N and N' be two neighbourhoods of x; then they contain
elements $b \subseteq N$ and $b' \subseteq N'$ of the same path q(x), and hence at least one of the relations $b' \supseteq b$ or $b \supseteq b'$ is
satisfied. We can assume without loss of generality that the latter is the case; then the intersection of N and N'
contains b' and hence is a neighbourhood of x, thus (N3) is satisfied. Finally, let N be a neighbourhood of x; then N
contains some $b \in q(x)$, which itself is a neighbourhood of x. Then, for every $y \in b$, b lies in the path of y, as
follows from proposition 27.2. Hence b is a neighbourhood of y. Consequently, N is a neighbourhood for each
$y \in b$. This shows that (N4) is satisfied. ■
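
For a finite base set, the four axioms can also be verified by exhaustive search. The sketch below is an illustration rather than part of the proof: it assumes that a tree is given as a finite family of `frozenset` nodes, uses the characterisation of q(x) from proposition 27.2, and all function names and example families are hypothetical.

```python
from itertools import combinations

def powerset(X):
    """All subsets of the finite set X, as frozensets."""
    items = sorted(X)
    return [frozenset(c) for r in range(len(items) + 1) for c in combinations(items, r)]

def neighbourhoods(family, X, x):
    """Subsets N of X that contain some node b with x in b, i.e. some b in q(x)."""
    path = [b for b in family if x in b]
    return {N for N in powerset(X) if any(b <= N for b in path)}

def satisfies_axioms(family, X):
    """Check (N1)-(N4) for the neighbourhood system induced by the family of nodes."""
    nbhd = {x: neighbourhoods(family, X, x) for x in X}
    subsets = powerset(X)
    for x in X:
        Ns = nbhd[x]
        n1 = all(x in N for N in Ns)
        n2 = all(S in Ns for N in Ns for S in subsets if N <= S)
        n3 = all((N & M) in Ns for N in Ns for M in Ns)
        n4 = all(any(M <= N and all(N in nbhd[y] for y in M) for M in Ns) for N in Ns)
        if not (n1 and n2 and n3 and n4):
            return False
    return True

X = frozenset({0, 1, 2, 3})
# Hypothetical tree: any two nodes are either nested or disjoint.
tree = {X, frozenset({0, 1}), frozenset({2, 3}), frozenset({0}), frozenset({1})}
# Not a tree: {0, 1} and {1, 2} overlap without being nested.
not_a_tree = {X, frozenset({0, 1}), frozenset({1, 2})}

print(satisfies_axioms(tree, X))
print(satisfies_axioms(not_a_tree, X))
```

The second family fails (N3), since the intersection {1} of the two overlapping nodes contains no node; this is the step of the proof where the nestedness of the nodes in a path is used.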



It is clear that the set of all possible tree structures over the given set X may be constrained in many 
different ways, for example, by imposing the conditions discussed in section 26.1. On each constrained set of
trees, the tree function E will take a minimum, which is an entropy-like quantity, and will single out those trees 
on the constrained set on which the minimum is actually taken. The associated trees then define preferred 
topologies on the underlying set by means of the construction given above. We see that this looks distinctly 
like an action principle for topologies on the set X, the role of the action being played by the tree function, 
the degrees of freedom being expressed by the different trees over X, and the minimal value of the action, i.e. of the
tree function E, being associated with an entropy-like quantity.

28 Summary 

We have presented a comprehensive account of a new mathematical structure, called tree structure, which 
arises in the formalisation of the operational aspects of information gaining. It was shown that a given set of 
tree structures can be endowed with a tree function whose value is related to the maximal number of yes-no 
questions which are necessary to identify a given node in the tree. The question of minimality of the tree 
function on these sets of trees can be posed. It was shown that, on unconstrained trees, the minimal value 
of the tree function is related to the dyadic logarithm of the number of elements in the base set; whilst, on 
constrained sets of trees, the tree function takes minima whose functional form is similar to the Shannon-Wiener
information, or entropy, of a probability distribution. We have presented three natural axioms governing tree 
structures. It was subsequently demonstrated that these axioms can be related to the axioms describing 
neighbourhood topologies on a given set. As a consequence, every tree structure defines a neighbourhood 
topology on a set. The minimisation of a tree function on a set of tree structures over a base set then opens 
up the possibility to obtain preferred neighbourhood topologies, namely those which are related to minimal 
trees over the given base set. This phenomenon has the distinct flavour of an action principle, distinguishing 
certain preferred neighbourhood topologies by means of a minimal principle. 